path: root/src/Text/Pandoc/Readers/LaTeX/Parsing.hs
Age | Commit message | Author | Files | Lines
2021-09-19 | LaTeX reader: Recognize that `\vadjust` sometimes takes "pre". | John MacFarlane | 1 | -0/+7
Closes #7531.
2021-08-21 | LaTeX-parser: restrict \endinput to current file | Simon Schuster | 1 | -0/+4
2021-08-11 | Fix some lint issues. | John MacFarlane | 1 | -2/+2
2021-08-11 | Fix scope for LaTeX macros. | John MacFarlane | 1 | -12/+37
They should by default scope over the group in which they are defined (except `\gdef` and `\xdef`, which are global). In addition, environments must be treated as groups. We handle this by making sMacros in the LaTeX parser state a STACK of macro tables. Opening a group adds a table to the stack, closing one removes one. Only the top of the stack is queried. This commit adds a parameter for scope to the Macro constructor (not exported). Closes #7494.
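The stack-of-tables idea described in this commit message can be sketched roughly as follows. This is a simplified illustration, not pandoc's code: `MacroTable`, `MacroStack`, `Scope`, and all function names are hypothetical, macros are reduced to plain strings, and it assumes that a newly opened group starts from a copy of the current top table (so outer definitions stay visible even though only the top is queried) and that global definitions are written into every table.

```haskell
import qualified Data.Map as M
import Data.List.NonEmpty (NonEmpty (..))
import qualified Data.List.NonEmpty as NE

-- Hypothetical simplified types; pandoc's real Macro and LaTeXState differ.
type MacroTable = M.Map String String
type MacroStack = NonEmpty MacroTable

-- Assumed stand-in for the scope parameter mentioned in the commit message.
data Scope = GroupScope | GlobalScope

-- Opening a group pushes a table (assumption: a copy of the current top).
openGroup :: MacroStack -> MacroStack
openGroup st = NE.cons (NE.head st) st

-- Closing a group pops the top table; the outermost table is never removed.
closeGroup :: MacroStack -> MacroStack
closeGroup (_ :| (t : ts)) = t :| ts
closeGroup st = st

-- Group-scoped definitions touch only the top table;
-- global ones (as with \gdef, \xdef) are written into every table.
define :: Scope -> String -> String -> MacroStack -> MacroStack
define GroupScope name body (t :| ts) = M.insert name body t :| ts
define GlobalScope name body st = NE.map (M.insert name body) st

-- Only the top of the stack is queried.
lookupMacro :: String -> MacroStack -> Maybe String
lookupMacro name st = M.lookup name (NE.head st)
```

With these definitions, a macro redefined inside a group reverts to its outer meaning once `closeGroup` is applied, while a `GlobalScope` definition made inside the group survives it.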
2021-08-11 | LaTeX reader: improve handling of plain TeX macro primitives. | John MacFarlane | 1 | -2/+2
- Fixed semantics for `\let`.
- Implement `\edef`, `\gdef`, and `\xdef`.
- Add comment noting that currently `\def` and `\edef` set global macros (so are equivalent to `\gdef` and `\xdef`). This should be fixed by scoping macro definitions to groups, in a future commit.
Closes #7474.
2021-07-11 | Improved parsing of raw LaTeX from Text streams (rawLaTeXParser). | John MacFarlane | 1 | -5/+34
We now use source positions from the token stream to tell us how much of the text stream to consume. Getting this to work required a few other changes to make token source positions accurate. Closes #7434.
2021-05-27 | LaTeX reader: improve `\def` and implement `\newif`. | John MacFarlane | 1 | -1/+18
- Improve parsing of `\def` macros. We previously set "verbatim mode" even for parsing the initial `\def`; this caused problems for things like
  ```
  \def\foo{\def\bar{BAR}}
  \foo
  \bar
  ```
- Implement `\newif`.
- Add tests.
2021-05-20 | LaTeX reader: More siunitx improvements. Closes #6658. | John MacFarlane | 1 | -1/+2
There's still one slight divergence from the siunitx behavior: we get 'kg m/A/s' instead of 'kg m/(A s)'. At the moment I'm not going to worry about that.
2021-05-19 | LaTeX reader: better support for `\xspace`. | John MacFarlane | 1 | -2/+19
Previously we only supported it in inline contexts; now we support it in all contexts, including math. Partially addresses #7299.
2021-05-09 | Change reader types, allowing better tracking of source positions. | John MacFarlane | 1 | -3/+9
Previously, when multiple file arguments were provided, pandoc simply concatenated them and passed the contents to the readers, which took a Text argument. As a result, the readers had no way of knowing which file was the source of any particular bit of text. This meant that we couldn't report accurate source positions on errors or include accurate source positions as attributes in the AST. More seriously, it meant that we couldn't resolve resource paths relative to the files containing them (see e.g. #5501, #6632, #6384, #3752).

Add Text.Pandoc.Sources (exported module), with a `Sources` type and a `ToSources` class. A `Sources` wraps a list of `(SourcePos, Text)` pairs. [API change] A parsec `Stream` instance is provided for `Sources`. The module also exports versions of parsec's `satisfy` and other Char parsers that track source positions accurately from a `Sources` stream (or any instance of the new `UpdateSourcePos` class).

Text.Pandoc.Parsing now exports these modified Char parsers instead of the ones parsec provides. Modified parsers to use a `Sources` as stream [API change].

The readers that previously took a `Text` argument have been modified to take any instance of `ToSources`. So, they may still be used with a `Text`, but they can also be used with a `Sources` object.

In Text.Pandoc.Error, modified the constructor PandocParsecError to take a `Sources` rather than a `Text` as first argument, so parse error locations can be accurately reported.

T.P.Error: showPos, do not print "-" as source name.
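To make the shape of this change concrete, here is a minimal sketch of a `Sources`-like wrapper and a `ToSources`-like class. It is an illustration under simplifying assumptions, not the contents of Text.Pandoc.Sources: `Pos` is a hypothetical stand-in for parsec's `SourcePos`, `String` replaces `Text`, and `sourceNames` is an invented helper that only shows a function accepting any instance of the class.

```haskell
{-# LANGUAGE FlexibleInstances #-}

-- Hypothetical stand-in for parsec's SourcePos (name, line, column).
data Pos = Pos { posName :: FilePath, posLine :: Int, posCol :: Int }
  deriving (Eq, Show)

-- Simplified Sources: one (start position, contents) pair per input.
newtype Sources = Sources [(Pos, String)]
  deriving (Eq, Show)

class ToSources a where
  toSources :: a -> Sources

instance ToSources Sources where
  toSources = id

-- A bare string carries no file name, so it gets an empty source name.
instance ToSources String where
  toSources s = Sources [(Pos "" 1 1, s)]

-- Named inputs keep their file names instead of being concatenated.
instance ToSources [(FilePath, String)] where
  toSources = Sources . map (\(fp, s) -> (Pos fp 1 1, s))

-- Invented helper: a function written against the class accepts any of the above.
sourceNames :: ToSources a => a -> [FilePath]
sourceNames x = [posName p | (p, _) <- srcs]
  where
    Sources srcs = toSources x
```

Because each chunk keeps its own starting position, an error or a relative resource path can be attributed to the file it came from, which is the motivation given in the commit message.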
2021-02-28 | T.P.Readers.LaTeX: Don't export tokenize, untokenize. | John MacFarlane | 1 | -0/+9
[API change] These were only exported for testing, which seems the wrong thing to do. They don't belong in the public API and are not really usable as they are, without access to the Tok type which is not exported. Removed the tokenize/untokenize roundtrip test. We put a quickcheck property in the comments which may be used when this code is touched (if it is).
2021-02-28 | Factor out T.P.Readers.LaTeX.Citation. | John MacFarlane | 1 | -0/+5
2021-02-27 | Factor out T.P.Readers.LaTeX.Table. | John MacFarlane | 1 | -0/+33
2021-02-21 | LaTeX reader: further optimizations in satisfyTok. | John MacFarlane | 1 | -5/+5
Benchmarks show 2/3 of the run time and 2/3 of the allocation of the Feb. 10 benchmarks.
2021-02-21 | LaTeX reader: removed sExpanded in state. | John MacFarlane | 1 | -7/+2
This isn't actually needed and checking it doesn't change anything. Also remove an unnecessary `doMacros` before `satisfyTok`, which does it anyway.
2021-02-21 | LaTeX reader: further performance optimization. | John MacFarlane | 1 | -23/+19
Avoid unnecessary 'doMacros'.
2021-02-20 | LaTeX reader: Another small improvement to macro handling. | John MacFarlane | 1 | -4/+3
2021-02-20 | LaTeX reader: avoid macro resolution code if no macros defined. | John MacFarlane | 1 | -16/+19
2021-02-20 | T.P.Readers.LaTeX.Parsing: improve braced'. | John MacFarlane | 1 | -16/+13
Remove the parameter, have it parse the opening brace, and make it more efficient.
2021-02-12 | LaTeX reader improvements. | John MacFarlane | 1 | -18/+66
* Rewrote `withRaw` so it doesn't rely on fragile assumptions about token positions (which break when macros are expanded). This requires the addition of `sEnableWithRaw` and `sRawTokens` in `LaTeXState`, and a new combinator `disablingWithRaw` to disable collecting of raw tokens in certain contexts.
* Add `parseFromToks` to T.P.Readers.LaTeX.Parsing.
* Fix parsing of single character tokens so it doesn't mess up the new raw token collecting.
* These changes slightly increase allocations and have a small performance impact, but it's minor.
Closes #7092.
2021-01-08 | Update copyright notices for 2021 (#7012) | Albert Krewinkel | 1 | -1/+1
2021-01-04 | LaTeX reader: handle filecontents environment. | John MacFarlane | 1 | -0/+2
Closes #7003.
2020-11-16 | Move getNextNumber from Readers.LaTeX to Readers.LaTeX.Parsing. | John MacFarlane | 1 | -0/+26
2020-11-02 | LaTeX reader: fix bug parsing macro arguments. | John MacFarlane | 1 | -1/+5
If `\cL` is defined as `\mathcal{L}`, and `\til` as `\tilde{#1}`, then `\til\cL` should expand to `\tilde{\mathcal{L}}`, but pandoc was expanding it to `\tilde\mathcal{L}`. This is fixed by parsing the arguments in "verbatim mode" when the macro expands arguments at the point of use. Closes #6796.
2020-10-08 | LaTeX reader: Fix parsing of "show name" in newtheorem. | John MacFarlane | 1 | -1/+1
Previously we were just treating it as a string and ignoring accents and formatting. See #6734.
2020-09-13 | Fix hlint suggestions, update hlint.yaml (#6680) | Christian Despres | 1 | -8/+6
* Fix hlint suggestions, update hlint.yaml
  Most suggestions were redundant brackets. Some required LambdaCase. The .hlint.yaml file had a small typo, and didn't ignore camelCase suggestions in certain modules.
2020-07-22 | LaTeX reader: Support ams `\theoremstyle`. | John MacFarlane | 1 | -2/+10
2020-07-22 | LaTeX reader: support theorem environments and `\newtheorem`. | John MacFarlane | 1 | -0/+2
Includes numbering and labels and refs. Note that numbering support is not complete; we don't reset numbers with sections for example.
2020-07-22 | LaTeX reader: support ams proof environment. | John MacFarlane | 1 | -0/+10
2020-07-22 | Moved more from LaTeX reader to LaTeX.Parsing. | John MacFarlane | 1 | -0/+67
2020-07-20 | Move some code from T.P.R.LaTeX to T.P.R.LaTeX.Parsing. | John MacFarlane | 1 | -0/+64
We need to reduce the size of the LaTeX reader to ease compilation on resource-limited systems. More can be done in this vein.
2020-03-22 | Finer grained imports of Text.Pandoc.Class submodules (#6203) | Albert Krewinkel | 1 | -1/+1
This should speed up recompilation after changes in `Text.Pandoc.Class`, as the number of modules affected by a change will be smaller in general. It also offers faster insights into the parts of `T.P.Class` used within a module.
2020-03-15 | Use implicit Prelude (#6187) | Albert Krewinkel | 1 | -2/+0
* Use implicit Prelude
  The previous behavior was introduced as a fix for #4464. It seems that this change alone did not fix the issue, and `stack ghci` and `cabal repl` only work with GHC 8.4.1 or newer, as no custom Prelude is loaded for these versions. Given this, it seems cleaner to revert to the implicit Prelude.
* PandocMonad: remove outdated check for base version
  Only base versions 4.9 and later are supported, the check for `MIN_VERSION_base(4,8,0)` is therefore unnecessary.
* Always use custom prelude
  Previously, the custom prelude was used only with older GHC versions, as a workaround for problems with ghci. The ghci problems are resolved by replacing package `base` with `base-noprelude`, allowing for consistent use of the custom prelude across all GHC versions.
2020-03-13 | Update copyright year (#6186) | Albert Krewinkel | 1 | -1/+1
* Update copyright year
* Copyright: add notes for Lua and Jira modules
2020-02-12 | LaTeX reader: improve caption and label parsing. | John MacFarlane | 1 | -2/+4
- Don't emit empty Span elements for labels.
- Put tables with labels in a surrounding Div.
2020-02-11 | LaTeX reader: resolve `\ref` to table numbers. | John MacFarlane | 1 | -0/+2
Closes #6137.
2020-02-07 | Resolve HLint warnings | Albert Krewinkel | 1 | -2/+2
All warnings are either fixed or, if more appropriate, HLint is configured to ignore them. HLint suggestions remain.
* Ignore "Use camelCase" warnings in Lua and legacy code
* Fix or ignore remaining HLint warnings
* Remove redundant brackets
* Remove redundant `return`s
* Remove redundant as-pattern
* Fuse mapM_/map
* Use `.` to shorten code
* Remove redundant `fmap`
* Remove unused LANGUAGE pragmas
* Hoist `not` in Text.Pandoc.App
* Use fewer imports for `Text.DocTemplates`
* Remove redundant `do`s
* Remove redundant `$`s
* Jira reader: remove unnecessary parentheses
2020-02-05 | LaTeX reader: skip comments in more places where this is needed. | John MacFarlane | 1 | -2/+4
Closes #6114.
2019-11-12 | Switch to new pandoc-types and use Text instead of String [API change]. | despresc | 1 | -16/+16
PR #5884.
+ Use pandoc-types 1.20 and texmath 0.12.
+ Text is now used instead of String, with a few exceptions.
+ In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text).
+ In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output.
+ `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-11-02 | LaTeX untokenize: Ensure space between control sequence and following letter. | John MacFarlane | 1 | -2/+14
Closes #5836.
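The rule in this commit's subject line can be illustrated with a toy untokenizer. This is only a sketch of the idea, not pandoc's `untokenize`: the `Tok` type here is a hypothetical two-constructor simplification of pandoc's unexported token type. Without the inserted space, `\alpha` followed by `x` would be rendered as `\alphax` and re-read as a different control sequence.

```haskell
import Data.Char (isLetter)

-- Hypothetical simplified token type; pandoc's Tok carries more information.
data Tok
  = CtrlSeq String -- control sequence name, without the backslash
  | Other String   -- any other token, rendered as-is

render :: Tok -> String
render (CtrlSeq name) = '\\' : name
render (Other s) = s

-- Insert a space when an alphabetic control sequence is directly
-- followed by a token that starts with a letter.
untokenize :: [Tok] -> String
untokenize (t@(CtrlSeq name) : rest@(Other (c : _) : _))
  | not (null name), all isLetter name, isLetter c =
      render t ++ " " ++ untokenize rest
untokenize (t : rest) = render t ++ untokenize rest
untokenize [] = ""
```

For example, `untokenize [CtrlSeq "alpha", Other "x"]` yields `\alpha x`, while a following `{` is appended with no space.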
2019-10-23 | T.P.Readers.LaTeX.Parsing: add `[Tok]` parameter to rawLaTeXParser. | John MacFarlane | 1 | -4/+3
This allows us to avoid retokenizing multiple times in e.g. rawLaTeXBlock. (Unexported module, so not an API change.)
2019-09-28 | Use Prelude.fail to avoid ambiguity with fail from GHC.Base. | John MacFarlane | 1 | -5/+5
2019-09-09 | LaTeX reader: Fix parsing of optional arguments that contain braced text. | John MacFarlane | 1 | -4/+3
Closes #5740.
2019-09-02 | LaTeX reader: properly handle optional arguments for macros. | John MacFarlane | 1 | -1/+1
Closes #5682.
2019-08-14 | LaTeX reader: improve withRaw so it can handle cases where... | John MacFarlane | 1 | -2/+3
the token string is modified by a parser (e.g. accent when it only takes part of a Word token). Closes #5686. Still not ideal, because we get the whole `\t0BAR` and not just `\t0` as a raw latex inline command. But I'm willing to let this be an edge case, since you can easily work around this by inserting a space, braces, or raw attribute. The important thing is that we no longer drop the rest of the document after a raw latex inline command that gobbles only part of a Word token!
2019-07-19 | Markdown: Ensure that expanded latex macros end with space if original did. | John MacFarlane | 1 | -1/+10
Closes #4442.
2019-07-16 | LaTeX reader: handle \looseness command values better. | John MacFarlane | 1 | -5/+4
Closes #4439.
2019-03-01 | Remove license boilerplate. | John MacFarlane | 1 | -18/+0
The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-04 | Add missing copyright notices and remove license boilerplate (#5112) | Albert Krewinkel | 1 | -2/+2
Quite a few modules were missing copyright notices. This commit adds copyright notices everywhere via haddock module headers. The old license boilerplate comment is redundant with this and has been removed. Update copyright years to 2019. Closes #4592.
2019-01-31 | LaTeX reader: don't let `\egroup` match `{`. | John MacFarlane | 1 | -3/+3
`braced` now actually requires nested braces. Otherwise some legitimate command and environment definitions can break (see test/command/tex-group.md).