pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2019-12-19	Org reader: report errors properly	Albert Krewinkel	1	-2/+1
	Errors during parsing are now returned in full and no longer replaced by a custom message.
2019-11-12	Switch to new pandoc-types and use Text instead of String [API change].	despresc	1	-2/+2
	PR #5884. + Use pandoc-types 1.20 and texmath 0.12. + Text is now used instead of String, with a few exceptions. + In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text). + In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output. + `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-03-01	Remove license boilerplate.	John MacFarlane	1	-18/+0
	The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-04	Add missing copyright notices and remove license boilerplate (#5112)	Albert Krewinkel	1	-2/+2
	Quite a few modules were missing copyright notices. This commit adds copyright notices everywhere via haddock module headers. The old license boilerplate comment is redundant with this and has been removed. Update copyright years to 2019. Closes #4592.
2018-03-18	Use NoImplicitPrelude and explicitly import Prelude.	John MacFarlane	1	-0/+2
	This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-01-05	Update copyright notices to include 2018	Albert Krewinkel	1	-2/+2

2017-06-20	Move CR filtering from tabFilter to the readers.	John MacFarlane	1	-1/+2
	The readers previously assumed that CRs had been filtered from the input. Now we strip the CRs in the readers themselves, before parsing. (The point of this is just to simplify the parsers.) Shared now exports a new function `crFilter`. [API change] And `tabFilter` no longer filters CRs.
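A minimal Haskell sketch of the CR stripping described in this commit, assuming `crFilter` does nothing more than drop carriage-return characters from the input. It illustrates the behaviour and is not copied from pandoc's source.

```haskell
import qualified Data.Text as T

-- | Remove every carriage return, so CRLF line endings become plain LF.
-- Assumption: no other characters are touched.
crFilter :: T.Text -> T.Text
crFilter = T.filter (/= '\r')
```

Under this assumption, a reader would apply the filter once to its input before handing the text to its parser.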
2017-06-10	Changed all readers to take Text instead of String.	John MacFarlane	1	-2/+5
	Readers: Renamed StringReader -> TextReader. Updated tests. API change.
2017-05-13	Update dates in copyright notices	Albert Krewinkel	1	-2/+2
	This follows the suggestions given by the FSF for GPL licensed software. <https://www.gnu.org/prep/maintain/html_node/Copyright-Notices.html>
2017-03-12	Issue warning for duplicate header identifiers.	John MacFarlane	1	-0/+2
	As noted in the previous commit, an autogenerated identifier may still coincide with an explicit identifier that is given for a header later in the document, or with an identifier on a div, span, link, or image. This commit adds a warning in this case, so users can supply an explicit identifier. * Added `DuplicateIdentifier` to LogMessage. * Modified HTML, Org, MediaWiki readers so their custom state type is an instance of HasLogMessages. This is necessary for `registerHeader` to issue warnings. See #1745.
2017-03-04	Stylish-haskell automatic formatting changes.	John MacFarlane	1	-9/+9

2017-01-25	Unify Errors.	Jesse Rosenthal	1	-1/+2

2017-01-25	Working on readers.	Jesse Rosenthal	1	-7/+13

2016-07-01	Org reader: refactor comment tree handling	Albert Krewinkel	1	-38/+1
	Comment trees were handled after parsing, as pattern matching on lists is easier than matching on sequences. The new method of reading documents as trees allows for more elegant subtree removal.
2016-06-03	Org reader: support smart quotes export option	Albert Krewinkel	1	-2/+3
	Reading of smart quotes can be toggled using the `'` option.
2016-05-25	Org reader: extract blocks parser to module	Albert Krewinkel	1	-844/+9
	Block parsing code is moved to a separate module. This is part of the Org-mode reader cleanup effort.
2016-05-25	Org reader: extract inline parser to module	Albert Krewinkel	1	-756/+41
	Inline parsing code is moved to a separate module. Parsers for block starts are extracted as well, as those are used in the `endline` parser. This is part of the Org-mode reader cleanup effort.
2016-05-25	Org reader: extract parsing function to module	Albert Krewinkel	1	-76/+10
	The Org-mode reader uses many functions defined in the `Text.Pandoc.Parsing` utility module. Some of the functions are overwritten with versions adapted to Org-mode idiosyncrasies. These special functions, as well as the normal Pandoc versions, are combined in a single module to increase the ease of use. This leads to decoupling of Org-mode and Pandoc and hence to slightly cleaner code. The downside is code-bloat due to repeated import/export statements.
2016-05-23	Org reader: respect drawer export setting	Albert Krewinkel	1	-11/+65
	The `d` export option can be used to control which drawers are exported and which are discarded. Basic support for this option is added here.
2016-05-22	Org reader/writer: use CUSTOM_ID in properties	Albert Krewinkel	1	-3/+4
	The `ID` property is reserved for internal use by Org-mode and should not be used. The `CUSTOM_ID` property is to be used instead; it is converted to the `ID` property for certain export formats. The reader and writer erroneously used `ID`. This is corrected by using `CUSTOM_ID` where appropriate.
2016-05-20	Org reader: add :PROPERTIES: drawer support	Albert Krewinkel	1	-28/+56
	Headers can have optional `:PROPERTIES:` drawers associated with them. These drawers contain key/value pairs like the header's `id`. The reader adds all listed pairs to the header's attributes; `id` and `class` attributes are handled specially to match the way `Attr` are defined. This also changes behavior of how drawers of unknown type are handled. Instead of including all unknown drawers, those are not read/exported, thereby matching current Emacs behavior. This closes #1877.
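A rough Haskell sketch of the mapping from drawer properties to header attributes described in this commit. The type alias, the function name `propertiesToAttr`, and the exact-match lowercase keys are assumptions made for illustration; the reader's real code may differ.

```haskell
import Data.Maybe (fromMaybe)

-- | Same shape as an attribute triple: (identifier, classes, key/value pairs).
-- The alias name is hypothetical.
type AttrSketch = (String, [String], [(String, String)])

-- | Hypothetical helper: `id` becomes the identifier, `class` values are
-- split on whitespace into classes, and all other pairs are kept as they are.
propertiesToAttr :: [(String, String)] -> AttrSketch
propertiesToAttr props = (ident, classes, others)
  where
    ident   = fromMaybe "" (lookup "id" props)
    classes = concatMap (words . snd) (filter ((== "class") . fst) props)
    others  = filter (\(k, _) -> k `notElem` ["id", "class"]) props
```

For instance, the pairs `id`/`intro`, `class`/`a b`, and `lang`/`en` would give the identifier `intro`, the classes `a` and `b`, and one remaining pair.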
2016-05-19	Org reader: add support for ATTR_HTML attributes	Albert Krewinkel	1	-7/+28
	Arbitrary key-value pairs can be added to some block types using a `#+ATTR_HTML` line before the block. Emacs Org-mode only includes these when exporting to HTML, but since we cannot make this distinction here, the attributes are always added. The functionality is now supported for figures. This closes #1906.
2016-05-19	Org reader: use custom `anyLine`	Albert Krewinkel	1	-3/+10
	Additional state changes need to be made after a newline is parsed, otherwise markup may not be recognized correctly. This fixes a bug where markup after certain block-types would not be recognized. E.g. `/emph/` in the following three-line snippet was not parsed as emphasized: `foo`, then `# comment`, then `/emph/`.
2016-05-19	Org reader: refactor block attribute handling	Albert Krewinkel	1	-79/+77
	A parser state attribute was used to keep track of block attributes defined in meta-lines. Global state is undesirable, so block attributes are no longer saved as part of the parser state. Old functions and the respective part of the parser state are removed.
2016-05-11	Org reader: parse but ignore export options	Albert Krewinkel	1	-2/+35
	All known export options are parsed but ignored.
2016-05-11	Org reader: add support for sub/superscript export options	Albert Krewinkel	1	-3/+25
	Org-mode allows export settings to be specified via `#+OPTIONS` lines. Disabling simple sub- and superscripts is one of these export options; this option is now supported.
2016-05-11	Org reader: move parser state into separate module	Albert Krewinkel	1	-158/+57
	The org reader code has become large and confusing. Extracting smaller parts into submodules should help to clean things up.
2016-05-09	Org reader: fix inline-LaTeX regression	Albert Krewinkel	1	-4/+9
	The last fix for whitespace handling of inline LaTeX commands was incorrect, preventing correct recognition of inline LaTeX commands which contain spaces. This fix ensures that only trailing whitespace is cut off.
2016-05-05	Merge pull request #2898 from tarleb/org-table-refactoring	John MacFarlane	1	-59/+59
	Org reader: table parsing code refactoring and fixes
2016-05-04	Org reader: fix spacing after LaTeX-style symbols	Albert Krewinkel	1	-5/+7
	The org-reader was dropping space after unescaped LaTeX-style symbol commands: `\ForAll \Auml` resulted in `∀Ä` but should give `∀ Ä` instead. This seems to be because the LaTeX-reader treats the command-terminating space as part of the command. Dropping the trailing space from the symbol-command fixes this issue.
2016-05-04	Org reader: fix handling of empty table cells, rows	Albert Krewinkel	1	-13/+17
	This fixes Org mode parsing of some corner cases regarding empty cells and rows. Empty cells weren't parsed correctly, e.g. `|||` should be two empty cells, but would be parsed as a single cell containing a pipe character. Empty rows were parsed as alignment rows and dropped from the output. This fixes #2616.
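The empty-cell rule can be sketched in Haskell as below. This is a simplified illustration, not the reader's parser: `rowCells` and `trim` are hypothetical names, and the sketch ignores alignment rows and any markup inside cells.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd)

-- | Strip leading and trailing whitespace from a cell.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace

-- | Split a table row on pipes. Every pipe after the first closes one cell,
-- so two adjacent pipes yield an empty cell rather than a literal pipe.
-- Lines that do not start with a pipe are not treated as rows.
rowCells :: String -> [String]
rowCells ('|' : rest) = go rest
  where
    go s = case break (== '|') s of
      (cell, '|' : more) -> trim cell : go more
      _                  -> []  -- text after the last pipe is not a cell
rowCells _ = []
```

With this sketch, `rowCells "|||"` gives two empty cells and `rowCells "| a || c |"` gives three cells, the middle one empty.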
2016-05-04	Org reader: refactor rows-to-table conversion	Albert Krewinkel	1	-25/+25
	This refactors the code converting a list of table lines to an org table ADT. The old code was simplified and is now slightly less ugly.
2016-05-04	Org reader: stop padding short table rows	Albert Krewinkel	1	-24/+20
	Emacs Org-mode doesn't add any padding to table rows. The first row (header or first body row) is used to determine the column count, no other magic is performed. The org reader was padding rows to the length of the longest table row. This was done due to a misunderstanding of how Org handles tables. This feature reflected how Org-mode handles tables when pressing <TAB>. The Org exporter however, which is what the reader should implement, doesn't do any of this. So this was a mis-feature that made the reader more complex and reduced comparability. It was hence removed.
2016-04-26	Ignore leading space in org code blocks	Emanuel Evans	1	-4/+20
	Fixes #2862. Also fix up tab handling for leading whitespace in code blocks.
2016-02-20	Merge pull request #2646 from tarleb/org-figure-with-no-name	John MacFarlane	1	-3/+3
	Prefix even empty figure names with "fig:"
2016-01-31	Org reader: Refactor link-target processing	Albert Krewinkel	1	-29/+29
	Cleanup of the code for link target handling. Most notably, the canonicalization of a link is handled by a separate function. This fixes #2684.
2016-01-22	Changed type of Shared.uniqueIdent argument from [String] to Set String.	John MacFarlane	1	-2/+3
	This avoids performance problems in documents with many identically named headers. Closes #2671.
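A small Haskell sketch of why a `Set String` helps here: each membership test is logarithmic in the number of used identifiers, whereas a list has to be scanned linearly. The function below is a hypothetical stand-in, not `Shared.uniqueIdent` itself; it takes an already-computed base identifier, and the `-n` suffix scheme is an assumption.

```haskell
import qualified Data.Set as Set

-- | Hypothetical stand-in: return the base identifier if it is unused,
-- otherwise the first unused "base-n" for n = 1, 2, 3, ...
uniqueIdentSketch :: String -> Set.Set String -> String
uniqueIdentSketch base used
  | base `Set.notMember` used = base
  | otherwise                 = go 1
  where
    go :: Int -> String
    go n
      | candidate `Set.notMember` used = candidate
      | otherwise                      = go (n + 1)
      where
        candidate = base ++ "-" ++ show n
```

In a document with many identically named headers, each new header probes a growing run of candidates, so the cost of a single probe matters.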
2016-01-11	Prefix even empty figure names with "fig:"	Albert Krewinkel	1	-3/+3
	The convention used by pandoc for figures is to mark them by prefixing the name with "fig:". The org reader failed to do this if a figure had no name. The test for this was broken as well. This fixes #2643.
2016-01-07	Fix function dropping subtrees tagged :noexport:	Albert Krewinkel	1	-2/+4
	Continue scanning for comment subtrees beyond only the first block. Note to self: when writing a recursive function, don't forget to, you know, actually recurse. Shout-out to @mrvdb for noticing this. This fixes #2628.
2015-12-12	Modified readers to emit SoftBreak when appropriate.	John MacFarlane	1	-1/+1

2015-11-13	Merge pull request #2526 from tarleb/org-definition-lists-fix	John MacFarlane	1	-1/+4
	Org reader: Require whitespace around def list markers
2015-11-13	Org reader: Require whitespace around def list markers	Albert Krewinkel	1	-1/+4
	Definition list markers (i.e. double colons `::`) must be surrounded by whitespace to start a definition item. This rule was not checked before, resulting in bugs with footnotes and some link types. Thanks to @conklech for noticing and reporting this issue. This fixes #2518.
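The whitespace rule can be sketched in Haskell as follows. This is a simplified illustration with a hypothetical function name: it works on a single line, ignores the list bullet, and collapses runs of whitespace, none of which is claimed about the reader's actual parser.

```haskell
-- | Hypothetical helper: split "term :: definition" at the first "::" that
-- stands alone as a whitespace-delimited token. A "::" glued to other
-- characters, as in a footnote or a link, does not count as a marker.
splitDefItem :: String -> Maybe (String, String)
splitDefItem line =
  case break (== "::") (words line) of
    (term, _ : definition) -> Just (unwords term, unwords definition)
    _                      -> Nothing
```

Under these assumptions, `term :: definition text` splits into a term and a definition, while `see [fn::a note]` is left alone because its `::` is not surrounded by whitespace.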
2015-11-13	Org reader: Fix emphasis rules for smart parsing	Albert Krewinkel	1	-4/+9
	Smart quotes, ellipses, and dashes should behave like normal quotes, single dashes, and dots with respect to text markup parsing. The parser state was not updated properly in all cases, which has been fixed. Thanks to @conklech for reporting this issue. This fixes #2513.
2015-11-09	Restored Text.Pandoc.Compat.Monoid.	John MacFarlane	1	-0/+1
	Don't use custom prelude for latest ghc. This is a better approach to making 'stack ghci' and 'cabal repl' work. Instead of using NoImplicitPrelude, we only use the custom prelude for older ghc versions. The custom prelude presents a uniform API that matches the current base version's prelude. So, when developing (presumably with latest ghc), we don't use a custom prelude at all and hence have no trouble with ghci. The custom prelude no longer exports (<>): we now want to match the base 4.8 prelude behavior.
2015-11-09	Revert "Use -XNoImplicitPrelude and 'import Prelude' explicitly."	John MacFarlane	1	-1/+0
	This reverts commit c423dbb5a34c2d1195020e0f0ca3aae883d0749b.
2015-11-08	Merge pull request #2505 from tarleb/org-header-markup-fix	John MacFarlane	1	-1/+1
	Org reader: fix markup parsing in headers
2015-11-08	Use -XNoImplicitPrelude and 'import Prelude' explicitly.	John MacFarlane	1	-0/+1
	This is needed for ghci to work with pandoc, given that we now use a custom prelude. Closes #2503.
2015-11-08	Org reader: fix markup parsing in headers	Albert Krewinkel	1	-1/+1
	Markup as the very first item in a header wasn't recognized. This was caused by an incorrect parser state: positions at which inline markup can start need to be marked explicitly by changing the parser state. This wasn't done for headers. The proper function to update the state is now called at the beginning of the header parser, fixing this issue. This fixes #2504.
2015-10-25	Merge pull request #2477 from tarleb/org-toggling-header-args	John MacFarlane	1	-4/+16
	Org reader: allow toggling header args
2015-10-25	Org reader: allow toggling header args	Albert Krewinkel	1	-4/+16
	Org-mode allows the argument of a code block header argument to be skipped if it's toggling a value. Argument-less headers are now recognized, avoiding weird parsing errors. The fixes are not exactly pretty, but neither is the code that was fixed. So I guess it's about par for the course. However, a rewrite of the header parsing code wouldn't hurt in the long run. Thanks to @jo-tham for filing the bug report. This fixes #2269.