path: root/src/Text/Pandoc/Readers/LaTeX.hs
Age  Commit message  Author  Files  Lines
2021-07-11  Improved parsing of raw LaTeX from Text streams (rawLaTeXParser).  John MacFarlane  1  -6/+3
We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.
2021-06-01  LaTeX reader: don't allow optional * on symbol control sequences.  John MacFarlane  1  -2/+4
Generally we allow optional starred variants of LaTeX commands (since many allow them, and if we don't accept these explicitly, ignoring the star usually gives acceptable results). But we don't want to do this for `\(*\)` and similar cases. Closes #7340.
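The rule described above can be sketched roughly as follows. This is an illustration only, not pandoc's code: `CsName`, `allowsStar`, and `takeStar` are hypothetical names, and the assumption that a "symbol control sequence" is any name not made up entirely of letters is ours.

```haskell
import Data.Char (isLetter)

-- | A control sequence name, without the leading backslash (hypothetical type).
type CsName = String

-- | Assumption: only names made entirely of letters (e.g. "section") may
-- take an optional star; symbol control sequences such as "(" may not.
allowsStar :: CsName -> Bool
allowsStar name = not (null name) && all isLetter name

-- | Given a command name and the input that follows it, report whether a
-- star was consumed and return the remaining input.
takeStar :: CsName -> String -> (Bool, String)
takeStar name ('*' : rest) | allowsStar name = (True, rest)
takeStar _ input = (False, input)
```

With this sketch, `takeStar "section" "*{Intro}"` consumes the star, while `takeStar "(" "*\\)"` leaves it in the input so that it is parsed as math content.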
2021-05-19  LaTeX reader: better support for `\xspace`.  John MacFarlane  1  -12/+0
Previously we only supported it in inline contexts; now we support it in all contexts, including math. Partially addresses #7299.
2021-05-09  Change reader types, allowing better tracking of source positions.  John MacFarlane  1  -11/+12
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752).
Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change]
A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class).
Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change].
The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object.
In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported.
T.P.Error: showPos, do not print "-" as source name.
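To illustrate why keeping the inputs as separate chunks helps, here is a minimal toy model. It is not the `Text.Pandoc.Sources` API: `ToySources` and `locate` are hypothetical, and the sketch pairs a plain source name with a `String`, whereas the real `Sources` pairs a `SourcePos` with a `Text`.

```haskell
-- Toy stand-ins, for illustration only.
type SourceName = String

newtype ToySources = ToySources [(SourceName, String)]

-- | Map a character offset into the concatenated input back to the
-- originating source and the 1-based line number within that source.
locate :: ToySources -> Int -> Maybe (SourceName, Int)
locate (ToySources chunks) = go chunks
  where
    go [] _ = Nothing
    go ((name, txt) : rest) off
      | off < 0 = Nothing
      | off < length txt =
          Just (name, 1 + length (filter (== '\n') (take off txt)))
      | otherwise = go rest (off - length txt)
```

Once the inputs are concatenated into a single string, this mapping can no longer be recovered, which is the problem the commit message describes.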
2021-04-25  Minor code reformatting.  John MacFarlane  1  -1/+2
Also taking this opportunity to note, for the record, that the commit for #7241 should be marked [API change]. It changes the type of `languagesByExtension` in Highlighting, adding a parameter for a `SyntaxMap`.
2021-04-25  Writers: Recognize custom syntax definitions (#7241)  Jan Tojnar  1  -1/+2
Languages defined using `--syntax-definition` were not recognized by `languagesByExtension`. This patch corrects that, allowing the writers to see all custom definitions. The LaTeX writer still uses the default syntax map, but that's okay in that context, since `--syntax-definition` won't create new listings styles.
2021-04-17  Update to released unicode-collation, latest citeproc dev version.  John MacFarlane  1  -1/+1
Update citeproc test.
2021-04-17  Remove Text.Pandoc.BCP47 module.  John MacFarlane  1  -1/+1
[API change] Use Lang from UnicodeCollation.Lang instead. This is a richer implementation of BCP 47.
2021-03-19  Use NonEmpty instead of minimumDef.  John MacFarlane  1  -2/+2
2021-03-18  Use minimumDef instead of minimum (partial function).  John MacFarlane  1  -1/+1
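The progression in these two commits (partial `minimum`, then a defaulting minimum, then `NonEmpty`) can be sketched as below. The names `minimumOr` and `smallestOf` are hypothetical; `minimumOr` is a local stand-in written for this sketch, not the `safe` package's function, and none of this is the code the commits changed.

```haskell
import Data.List.NonEmpty (NonEmpty)
import qualified Data.List.NonEmpty as NE

-- | Defaulting minimum: returns the fallback instead of crashing on [].
minimumOr :: Ord a => a -> [a] -> a
minimumOr def [] = def
minimumOr _ xs = minimum xs

-- | With NonEmpty the empty case is ruled out by the type, so the
-- Foldable 'minimum' is total here and no default is needed.
minimumNE :: Ord a => NonEmpty a -> a
minimumNE = minimum

-- | Convert at the boundary: an empty list becomes Nothing.
smallestOf :: Ord a => [a] -> Maybe a
smallestOf = fmap minimumNE . NE.nonEmpty
```

The design trade-off is that a default value hides the empty case at the call site, while `NonEmpty` forces the caller to handle it once, where the list is built.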
2021-03-18  Require safe >= 0.3.18 and remove cpp.  John MacFarlane  1  -5/+0
2021-03-07  LaTeX reader: support hyperref command.  John MacFarlane  1  -4/+13
Closes #7127.
2021-03-03  Revert "Add T.P.Readers.LaTeX.Include."  John MacFarlane  1  -7/+50
This reverts commit b569b0226d4bd5e0699077089d54fb03d4394b7d. Memory usage improvement in compilation wasn't very significant.
2021-03-03  Add T.P.Readers.LaTeX.Include.  John MacFarlane  1  -50/+7
2021-03-03  Remove T.P.Readers.LaTeX.Accent.  John MacFarlane  1  -1/+1
Incorporate accentCommands into T.P.Readers.LaTeX.Inline.
2021-03-03  Move enquote commands to T.P.LaTeX.Lang.  John MacFarlane  1  -20/+2
2021-03-03  Moved more into T.P.Readers.LaTeX.Lang.  John MacFarlane  1  -78/+6
2021-03-03  Split out T.P.Readers.LaTeX.Inline.  John MacFarlane  1  -336/+138
2021-03-01  Make T.P.Readers.LaTeX.Types an unexported module.  John MacFarlane  1  -1/+1
[API change] This is really an implementation detail that shouldn't be exposed in the public API.
2021-03-01  Factor out T.P.Readers.LaTeX.Macro.  John MacFarlane  1  -139/+2
2021-02-28  Removed unnecessary pragmas.  John MacFarlane  1  -2/+0
2021-02-28  Change T.P.Readers.LaTeX.SIunitx to export a command map...  John MacFarlane  1  -9/+2
instead of individual commands.
2021-02-28  T.P.Readers.LaTeX: Don't export tokenize, untokenize.  John MacFarlane  1  -2/+0
[API change] These were only exported for testing, which seems the wrong thing to do. They don't belong in the public API and are not really usable as they are, without access to the Tok type which is not exported. Removed the tokenize/untokenize roundtrip test. We put a quickcheck property in the comments which may be used when this code is touched (if it is).
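The commit message does not show the roundtrip property itself, so the following is only a guess at its general shape, written against a toy tokenizer. `Tok`, `toyTokenize`, `toyUntokenize`, and `prop_roundtrip` are all hypothetical; pandoc's real `Tok` type also carries a source position and a token kind.

```haskell
import Data.Char (isLetter)

-- | Toy token that keeps the exact source text it was built from.
newtype Tok = Tok String deriving (Eq, Show)

-- | Toy tokenizer: a backslash followed by letters, or by one other
-- character, forms a control sequence; any other character is its own token.
toyTokenize :: String -> [Tok]
toyTokenize [] = []
toyTokenize ('\\' : cs) =
  case span isLetter cs of
    ([], c : rest) -> Tok ['\\', c] : toyTokenize rest
    (name, rest)   -> Tok ('\\' : name) : toyTokenize rest
toyTokenize (c : cs) = Tok [c] : toyTokenize cs

-- | Concatenate the source text of each token.
toyUntokenize :: [Tok] -> String
toyUntokenize = concatMap (\(Tok t) -> t)

-- | Roundtrip property: tokenizing and then untokenizing is the identity.
prop_roundtrip :: String -> Bool
prop_roundtrip s = toyUntokenize (toyTokenize s) == s
```

A property of this form can be handed to a property-testing library such as QuickCheck, or checked by hand on a few inputs as a quick sanity test after changing the tokenizer.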
2021-02-28  Factor out T.P.Readers.LaTeX.Math.  John MacFarlane  1  -193/+8
2021-02-28  LaTeX reader: another small efficiency improvement.  John MacFarlane  1  -6/+12
2021-02-28  LaTeX reader efficiency improvements.  John MacFarlane  1  -31/+42
In conjunction with other changes this makes the reader almost twice as fast on our benchmark as it was on Feb. 10.
2021-02-28  Move setDefaultLanguage to T.P.Readers.LaTeX.Lang.  John MacFarlane  1  -14/+2
2021-02-28  LaTeX reader: remove two unnecessary parsers in inline.  John MacFarlane  1  -2/+0
These are handled anyway by regularSymbol.
2021-02-28  Factor out T.P.Readers.LaTeX.Citation.  John MacFarlane  1  -186/+16
2021-02-27  Factor out T.P.Readers.LaTeX.Table.  John MacFarlane  1  -363/+5
2021-02-27  Split off T.P.Readers.LaTeX.Accent.  John MacFarlane  1  -60/+8
To help reduce memory demands compiling the main LaTeX reader.
2021-02-13  LaTeX reader: remove unnecessary line  John MacFarlane  1  -1/+0
2021-02-12  Avoid an unnecessary withRaw.  John MacFarlane  1  -1/+4
2021-02-12  LaTeX reader improvements.  John MacFarlane  1  -4/+2
* Rewrote `withRaw` so it doesn't rely on fragile assumptions about token positions (which break when macros are expanded). This requires the addition of `sEnableWithRaw` and `sRawTokens` in `LaTeXState`, and a new combinator `disablingWithRaw` to disable collecting of raw tokens in certain contexts.
* Add `parseFromToks` to T.P.Readers.LaTeX.Parsing.
* Fix parsing of single character tokens so it doesn't mess up the new raw token collecting.
* These changes slightly increase allocations and have a small performance impact, but it's minor.
Closes #7092.
2021-01-26  Clean up BibTeX parsing.  John MacFarlane  1  -0/+18
Previously there was a messy code path that gave strange results in some cases, not passing through raw tex but trying to extract a string content. This was an artefact of trying to handle some special bibtex-specific commands in the BibTeX reader. Now we just handle these in the LaTeX reader and simplify parsing in the BibTeX reader. This does mean that more raw tex will be passed through (and currently this is not sensitive to the `raw_tex` extension; this should be fixed). Closes #7049.
2021-01-08  Update copyright notices for 2021 (#7012)  Albert Krewinkel  1  -1/+1
2021-01-04  LaTeX reader: handle filecontents environment.  John MacFarlane  1  -6/+26
Closes #7003.
2021-01-02  LaTeX reader: put contents of unknown environments in a Div...  John MacFarlane  1  -1/+1
when `raw_tex` is not enabled. (When `raw_tex` is enabled, the whole environment is parsed as a raw block.) The class name is the name of the environment. Previously, we just included the contents without the surrounding Div, but having a record of the environment's boundaries and name can be useful. Closes #6997.
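The behaviour described above can be sketched with a toy AST. The `Block` type and `unknownEnv` below are hypothetical simplifications written for this illustration; they are not the pandoc-types definitions or the reader's actual code.

```haskell
-- | Toy block type, loosely modelled on a document AST.
data Block
  = Para String
  | RawBlock String String   -- ^ format name, raw source text
  | Div [String] [Block]     -- ^ classes, contents
  deriving (Eq, Show)

-- | Decide how to represent an environment the reader does not know.
unknownEnv
  :: Bool      -- ^ is the raw_tex extension enabled?
  -> String    -- ^ environment name
  -> String    -- ^ raw source of the whole environment
  -> [Block]   -- ^ parsed contents of the environment
  -> Block
unknownEnv rawTexEnabled name rawSource parsed
  | rawTexEnabled = RawBlock "latex" rawSource
  | otherwise     = Div [name] parsed
```

For example, with `raw_tex` disabled, an environment named `foo` containing one paragraph becomes `Div ["foo"] [Para "..."]`, so the environment's name and boundaries remain available to filters and writers.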
2020-12-05  LaTeX reader: don't apply theorem default styling to a figure inside.  John MacFarlane  1  -0/+1
If we put an image in italics, then when rendering to Markdown we no longer get an implicit figure. Closes #6925.
2020-11-29  LaTeX reader: don't parse `\rule` with width 0 as horizontal rule.  John MacFarlane  1  -1/+11
2020-11-26  LaTeX reader: preserve center environment (#6852)  Igor Pashev  1  -1/+1
The contents of the `center` environment are put in a `Div` with class `center`.
2020-11-21  LaTeX reader: more robust parsing of bracketed options.  John MacFarlane  1  -3/+8
Improves on 9a40976. Closes #6873.
2020-11-20  Improve LaTeX option parsing...  John MacFarlane  1  -1/+3
in cases where we run into trouble parsing inlines til the closing `]`, e.g. quotes, we return a plain string with the option contents. Previously we mistakenly included the brackets in this string. Closes #6869.
2020-11-16  Move getNextNumber from Readers.LaTeX to Readers.LaTeX.Parsing.  John MacFarlane  1  -26/+0
2020-11-05  LaTeX reader: better handling of `\\` inside math in table cells.  John MacFarlane  1  -0/+2
Previously this confused the table parser. Closes #6811.
2020-11-03  Properly support optional cite argument for `\blockquote`.  John MacFarlane  1  -7/+8
(LaTeX reader) Closes #6802.
2020-10-13  LaTeX reader: support more acronym commands.  John MacFarlane  1  -0/+10
`\acl`, `\aclp`, and capitalized versions of already supported commands. Closes #6746.
2020-10-10  LaTeX reader: allow blank lines inside `\author`.  John MacFarlane  1  -6/+3
2020-10-08  LaTeX reader: Fix parsing of "show name" in newtheorem.  John MacFarlane  1  -5/+6
Previously we were just treating it as a string and ignoring accents and formatting. See #6734.
2020-09-19  Change deprecated Builder.isNull to null.  John MacFarlane  1  -1/+1