pandoc - Conversion between markup formats

Age	Commit message	Author	Files	Lines
2019-12-17	HTML reader: Add "nav" to list of block-level tags.	John MacFarlane	1	-1/+2

2019-11-12	Switch to new pandoc-types and use Text instead of String [API change].	despresc	1	-119/+112
	PR #5884. + Use pandoc-types 1.20 and texmath 0.12. + Text is now used instead of String, with a few exceptions. + In the MediaBag module, some of the types using Strings were switched to use FilePath instead (not Text). + In the Parsing module, new parsers `manyChar`, `many1Char`, `manyTillChar`, `many1TillChar`, `many1Till`, `manyUntil`, `manyUntilChar` have been added: these are like their unsuffixed counterparts but pack some or all of their output. + `glob` in Text.Pandoc.Class still takes String since it seems to be intended as an interface to Glob, which uses strings. It seems to be used only once in the package, in the EPUB writer, so that is not hard to change.
2019-11-11	Change the implementation of `htmlSpanLikeElements` and implement `<dfn>` (#5882)	Florian Beeres	1	-4/+11
	* Add HTML Reader support for `<dfn>`, parsing this as a Span with class `dfn`. * Change `htmlSpanLikeElements` implementation to retain classes, attributes and inline content.
2019-11-04	Removed an unnecessary unpack.	John MacFarlane	1	-1/+1

2019-11-04	HTML Reader/Writer - Add support for <var> and <samp> (#5861)	Amogh Rathore	1	-5/+7
	Closes #5799
2019-10-24	HTML reader/writer: Better handling of <q> with cite attribute (#5837)	Ole Martin Ruud	1	-23/+34
	* HTML reader: Handle cite attribute for quotes. If a `<q>` tag has a `cite` attribute, we interpret it as a Quoted element with an inner Span. Closes #5798 * Refactor url canonicalization into a helper function * Modify HTML writer to handle quote with cite. [0]: https://developer.mozilla.org/en-US/docs/Web/HTML/Element/q
2019-10-23	Add Reader support for HTML <samp> element (#5843)	Amogh Rathore	1	-0/+9
	The `<samp>` element is parsed as a Span with class `sample`. Closes #5792.
2019-10-15	Add support for reading and writing <kbd> elements	Daniele D'Orazio	1	-1/+9
	* Text.Pandoc.Shared: export `htmlSpanLikeElements` [API change] This commit also introduces a mapping of HTML span-like elements that are internally represented as a Span with a single class, but that are converted back to the original element by the HTML writer. As of now, only the kbd element is handled this way. Ideally these elements should be handled as plain AST values, but since that would be a breaking change with a large impact, we revert to this stop-gap solution. Fixes https://github.com/jgm/pandoc/issues/5796.
2019-09-28	Use Prelude.fail to avoid ambiguity with fail from GHC.Base.	John MacFarlane	1	-2/+2

2019-07-02	Fix redundant constraint warnings. (#5625)	Pete Ryland	1	-2/+2

2019-05-29	HTML reader: misc. epub related fixes.	John MacFarlane	1	-30/+41
	- With epub extensions, check for epub:type in addition to type. - Fix problem with noteref parsing which caused block-level content to be eaten with the noteref. - Rename pAnyTag to pAny. - Refactor note resolution.
2019-05-27	consolidate simple-table detection (#5524)	Mauro Bieg	1	-7/+2
	add `onlySimpleTableCells` to `Text.Pandoc.Shared` [API change] This fixes an inconsistency in the HTML reader, which did not treat tables with `<p>` inside cells as simple.
2019-05-25	HTML reader: trim definition list terms	Alexander Krotov	1	-1/+1

2019-03-25	HTML reader: read `data-foo` attribute into `foo`.	John MacFarlane	1	-1/+2
	The HTML writer adds the `data-` prefix for HTML5 for nonstandard attributes. But the attributes are represented in the AST without the `data-` prefix, so we should strip this when reading HTML. Closes #5392.
2019-03-01	Remove license boilerplate.	John MacFarlane	1	-18/+0
	The haddock module header contains essentially the same information, so the boilerplate is redundant and just one more thing to get out of sync.
2019-02-04	Add missing copyright notices and remove license boilerplate (#5112)	Albert Krewinkel	1	-2/+2
	Quite a few modules were missing copyright notices. This commit adds copyright notices everywhere via haddock module headers. The old license boilerplate comment is redundant with this and has been removed. Update copyright years to 2019. Closes #4592.
2019-01-21	HTML and markdown: treat textarea as a verbatim environment.	John MacFarlane	1	-1/+3
	We don't want to parse its contents as Markdown or HTML. Closes #5241.
2018-12-31	Remove unused HasHeaderMap (#5175)	Alexander	1	-6/+1
	It is updated by some readers, but never actually used.
2018-12-17	HTML reader: handle empty start attribute.	John MacFarlane	1	-4/+2
	See #5162.
2018-11-16	HTML reader: allow tfoot before body rows.	John MacFarlane	1	-2/+3
	Closes #5079.
2018-11-15	HTML reader: parse `<small>` as a Span with class "small".	John MacFarlane	1	-0/+4
	Closes #5080.
2018-11-13	HTML reader: allow thead containing a row with td rather than th.	John MacFarlane	1	-11/+11
	See #5014. Note that this doesn't address the original issue in #5014, only an unrelated side-issue.
2018-10-11	HTML reader: fix htmlTag and isInlineTag to accept processing instructions.	John MacFarlane	1	-8/+10
	Fixes regression #3123 (since 2.0). Added regression test.
2018-09-07	HTML reader: parse `<script type="math/tex` tags as math.	John MacFarlane	1	-0/+12
	These are used by MathJax. Closes #4877.
2018-08-24	HTML reader: allow enabling `raw_tex` extension.	John MacFarlane	1	-3/+28
	This now allows raw LaTeX environments, `\ref`, and `\eqref` to be parsed (which is helpful for translating HTML documents that use MathJax). Closes #1126.
2018-08-22	HTML reader: extract spaces inside links instead of trimming them	Alexander Krotov	1	-3/+3
	Fixes #4845
2018-07-02	Spellcheck comments	Alexander Krotov	1	-2/+2

2018-04-05	Changes to tests to accommodate changes in pandoc-types.	John MacFarlane	1	-2/+4
	In https://github.com/jgm/pandoc-types/pull/36 we changed the table builder to pad cells. This commit changes tests (and two readers) to accord with this behavior.
2018-03-18	Use NoImplicitPrelude and explicitly import Prelude.	John MacFarlane	1	-0/+2
	This seems to be necessary if we are to use our custom Prelude with ghci. Closes #4464.
2018-03-16	Monoid/Semigroup cleanup relying on custom Prelude.	John MacFarlane	1	-1/+1

2018-01-19	hlint code improvements.	John MacFarlane	1	-4/+4

2018-01-15	HTML reader: Fix col width parsing for percentages < 10% (#4262)	n3fariox	1	-3/+6
	Rather than taking the user input and placing a "0." in front, actually calculate the percentage, to catch cases where small column sizes (e.g. `2%`) are needed.
2018-01-05	Update copyright notices to include 2018	Albert Krewinkel	1	-2/+2

2017-12-27	Fix warning.	John MacFarlane	1	-2/+1

2017-12-27	Small improvement to figcaption parsing. #4184.	John MacFarlane	1	-2/+0

2017-12-27	Merge pull request #4184 from mb21/html-reader-figcaption	John MacFarlane	1	-4/+7
	HTML Reader: be more forgiving about figcaption
2017-12-27	HTML reader: parse div with class `line-block` as LineBlock.	John MacFarlane	1	-1/+13
	See #4162.
2017-12-23	HTML Reader: be more forgiving about figcaption	mb21	1	-4/+7
	fixes #4183
2017-12-06	Markdown reader: accept processing instructions as raw HTML.	John MacFarlane	1	-2/+3
	Closes #4125.
2017-12-04	Add `empty_paragraphs` extension.	John MacFarlane	1	-4/+9
	* Deprecate `--strip-empty-paragraphs` option. Instead we now use an `empty_paragraphs` extension that can be enabled on the reader or writer. By default, disabled. * Add `Ext_empty_paragraphs` constructor to `Extension`. * Revert "Docx reader: don't strip out empty paragraphs." This reverts commit d6c58eb836f033a48955796de4d9ffb3b30e297b. * Implement `empty_paragraphs` extension in docx reader and writer, opendocument writer, html reader and writer. * Add tests for `empty_paragraphs` extension.
2017-11-25	Fix comment typo: s/elemnet/element/	Alexander Krotov	1	-1/+1

2017-11-18	HTML reader: ensure we don't produce level 0 headers,	John MacFarlane	1	-5/+5
	even for chapter sections in epubs. This causes problems because writers aren't set up to expect these. This fixes the most immediate problem in #4076. It would be good to think more about how to propagate the information that top-level headers are chapters from the reader to the writer.
2017-11-10	HTML reader: hlint	Alexander Krotov	1	-31/+30

2017-11-01	Really fix #3989.	John MacFarlane	1	-5/+12
	The previous fix only worked in certain cases. Other cases with `>` in an HTML attribute broke.
2017-11-01	hlint	Alexander Krotov	1	-5/+5

2017-10-31	Fixed regression in parsing of HTML comments in markdown...	John MacFarlane	1	-2/+3
	and other non-HTML formats (`Text.Pandoc.Readers.HTML.htmlTag`). The parser stopped at the first `>` character, even if it wasn't the end of the comment. Closes #4019.
2017-10-29	Source code reformatting.	John MacFarlane	1	-61/+63

2017-10-27	Consistent underline for Readers (#2270)	hftf	1	-1/+5
	* Added underlineSpan builder function. This can be easily updated if needed. The purpose is for Readers to transform underlines consistently. * Docx Reader: Use underlineSpan and update test * Org Reader: Use underlineSpan and add test * Textile Reader: Use underlineSpan and add test case * Txt2Tags Reader: Use underlineSpan and update test * HTML Reader: Use underlineSpan and add test case
2017-10-24	HTML reader: close td/th should close any open block tag...	John MacFarlane	1	-0/+2
	Closes #3991.
2017-10-24	HTML reader: td should close an open th or td.	John MacFarlane	1	-0/+1
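
The column-width fix from 2018-01-15 (#4262) is easiest to see with a small calculation. The sketch below is an illustration in Python, not pandoc's Haskell code; the function names `naive_width` and `percent_width` are hypothetical and only mirror the two approaches the commit message describes.

```python
def naive_width(value: str) -> float:
    """Approach the commit message says was replaced:
    drop the '%' and put "0." in front of the digits."""
    return float("0." + value.rstrip("%"))


def percent_width(value: str) -> float:
    """Approach the commit message describes as the fix:
    parse the number and divide by 100."""
    return float(value.rstrip("%")) / 100


if __name__ == "__main__":
    for width in ["25%", "2%"]:
        # The two results agree for two-digit percentages but differ
        # for single-digit ones such as "2%".
        print(width, naive_width(width), percent_width(width))
```

With two digits the approaches agree, which is presumably why the problem only showed up for widths below 10%: prefixing "0." turns `2%` into 0.2 (20% of the width) rather than 0.02.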