author: John MacFarlane <jgm@berkeley.edu> 2021-03-02 14:23:02 -0800
committer: John MacFarlane <jgm@berkeley.edu> 2021-03-02 14:23:02 -0800
commit: 95b1327f6bdefd63b3a9241e2924bdf7bffe0156 (patch)
tree: 3055f16ae21fa6d3d7c78eba6897a41fa040c5ff
parent: af9f6c00aad5fe1e42032a67f397cc9ce75204db (diff)
Update changelog.
-rw-r--r-- changelog.md 280
1 files changed, 182 insertions, 98 deletions
diff --git a/changelog.md b/changelog.md
index b13bf1fd4..a48b668b5 100644
--- a/changelog.md
+++ b/changelog.md
@@ -2,6 +2,11 @@
## pandoc 2.12 (UNRELEASED -- PROVISIONAL)
+ * When downloading content from URL arguments, be sensitive to
+ the character encoding (#5600). We can properly handle UTF-8 and latin1
+ (ISO-8859-1); for others we raise an error. Fall back to latin1 if
+ no charset is given in the mime type and UTF-8 decoding fails.
+
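The charset rule in the entry above can be sketched as follows. This is a minimal illustration in Python, not pandoc's Haskell implementation: the names `decode_body` and `UnsupportedCharsetError` are hypothetical, and the behavior for a declared UTF-8 body that contains invalid bytes is a choice made for this sketch only.

```python
from typing import Optional


class UnsupportedCharsetError(Exception):
    """Hypothetical stand-in for the error raised on an unsupported charset."""


def decode_body(body: bytes, charset: Optional[str]) -> str:
    """Decode downloaded bytes according to the charset from the MIME type."""
    if charset is None:
        # No charset given: try UTF-8 first, then fall back to latin1.
        try:
            return body.decode("utf-8")
        except UnicodeDecodeError:
            return body.decode("latin-1")
    name = charset.strip().lower()
    if name in ("utf-8", "utf8"):
        # Assumption of this sketch: invalid bytes raise UnicodeDecodeError.
        return body.decode("utf-8")
    if name in ("latin1", "latin-1", "iso-8859-1"):
        return body.decode("latin-1")
    # Any other declared charset is rejected rather than guessed at.
    raise UnsupportedCharsetError(charset)
```

Note that latin1 decoding never fails, since every byte maps to a code point, which is why it works as the last-resort fallback.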
* Add new unexported module Text.Pandoc.XML.Light, as well
as Text.Pandoc.XML.Light.Types, Text.Pandoc.XML.Light.Proc,
Text.Pandoc.XML.Light.Output. (Closes #6001, #6565, #7091).
@@ -34,75 +39,6 @@
these have been added. And the docx golden tests have been updated,
because the new parser does not preserve the order of attributes.
- * Text.Pandoc.App
-
- + Add `parseOptionsFromArgs` [API change, new exported function].
-
- * Text.Pandoc.Citeproc.BibTeX
-
- + `Text.Pandoc.Citeproc.writeBibTeXString` now returns
- `Doc Text` instead of `Text` (#7068).
- + Correctly handle `pages` (= `page` in CSL) (#7067).
- + Correctly handle BibLaTeX `langid` (= `language` in CSL, #7067).
- + In BibTeX output, protect foreign titles since there's no language
- field (#7067).
- + Clean up BibTeX parsing (#7049). Previously there was a messy code
- path that gave strange results in some cases, not passing through raw
- tex but trying to extract a string content. This was an artefact of
- trying to handle some special bibtex-specific commands in the BibTeX
- reader. Now we just handle these in the LaTeX reader and simplify
- parsing in the BibTeX reader. This does mean that more raw tex will
- be passed through (and currently this is not sensitive to the
- `raw_tex` extension; this should be fixed).
-
- * Text.Pandoc.Citeproc.MetaValue
-
- + Correctly parse "raw" date value in markdown references metadata.
- (See jgm/citeproc#53.)
-
- * Text.Pandoc.Class
-
- + Add `getTimestamp` [API change]. This attempts to read the
- `SOURCE_DATE_EPOCH` environment variable and parse a UTC time
- from it (treating it as a unix date stamp, see
- https://reproducible-builds.org/specs/source-date-epoch/). If the
- variable is not set or can't be parsed as a unix date stamp, then the
- function returns the current date.
-
- * Text.Pandoc.Error
-
- + Remove unused variables (Albert Krewinkel)
- + Export `renderError` [API change].
- + Refactor `handleError` to use `renderError`. This allows us to render
- error messages without exiting.
-
- * Text.Pandoc.Extensions
-
- + `Ext_task_lists` is now supported by org (and turned
- on by default) (Albert Krewinkel, #6336).
- + Remove `Ext_fenced_code_attributes` from allowed commonmark attributes
- (#7097). This attribute was listed as allowed, but it didn't actually
- do anything. Use `attributes` for code attributes and more.
-
- * Lua subsystem:
-
- + Always load built-in Lua scripts from default data-dir (Albert
- Krewinkel). The Lua modules `pandoc` and `pandoc.List` are now always
- loaded from the system's default data directory. Loading from a
- different directory by overriding the default path, e.g. via
- `--data-dir`, is no longer supported to avoid unexpected behavior
- and to address security concerns.
- + Add module "pandoc.path" (Albert Krewinkel, #6001, #6565).
- The module makes it possible to work with file paths in a convenient and
- platform-independent manner.
-
- * Text.Pandoc.PDF
-
- + Disable `smart` extension when building PDF via LaTeX.
- This is to prevent accidental creation of ligatures like
- `` ?` `` and `` !` `` (especially in languages with quotations like
- German), and similar ligature issues. (See jgm/citeproc#54.)
-
* DocBook reader:
+ Avoid expensive tree normalization step, as it is not necessary
@@ -137,6 +73,11 @@
* LaTeX reader:
+ + Don't export `tokenize`, `untokenize` [API change]. These are internal
+ implementation details, which were only exported for testing.
+ They don't belong in the public API.
+ + Improved efficiency of the parser. With these changes the reader
+ is almost twice as fast as in the last release in our benchmarks.
+ Code cleanup, removing some unnecessary things.
+ Rewrite `withRaw` so it doesn't rely on fragile assumptions
about token positions (which break when macros are expanded)
@@ -156,11 +97,19 @@
it doesn't change anything.
+ Improve `braced'`. Remove the parameter, have it parse the
opening brace, and make it more efficient.
+ + Factor out pieces of the LaTeX reader to make the module smaller.
+ This reduces memory demands when compiling. Created
+ Text.Pandoc.Readers.{LaTeX,Math,Citation,Table,Macro,Accent}.
+ Changed Text.Pandoc.Readers.LaTeX.SIunitx to export a command map
+ instead of individual commands.
+
+ * Make Text.Pandoc.Readers.LaTeX.Types an unexported module [API change].
* Markdown reader:
+ Improved handling of mmd link attributes in references (#7080).
Previously they only worked for links that had titles.
+ + Improved efficiency of the parser (benchmarks show a 15% speedup).
* OPML reader:
@@ -218,7 +167,7 @@
+ Use `getTimestamp` instead of `getCurrentTime` for timestamp.
Setting `SOURCE_DATE_EPOCH` will allow reproducible builds.
- * Text.Pandoc.Writers.EPUB
+ * EPUB writer:
+ Use `getTimestamp` instead of `getCurrentTime` for timestamp.
Setting `SOURCE_DATE_EPOCH` will allow reproducible builds (#7093).
@@ -233,6 +182,10 @@
contains an ampersand or another character with a special meaning
in XML.
+ * Jira writer:
+
+ + Use Span identifiers as anchors (Albert Krewinkel,
+
* LaTeX writer:
+ Adjust hypertargets to beginnings of paragraphs (#7078).
@@ -241,6 +194,8 @@
This makes a particular difference for links to citations using
`--citeproc` and `link-citations: true`.
+ Change BCP47 lang tag from `jp` to `ja` (Mauro Bieg, #7047).
+ + Use function instead of map for accent lookup (should be
+ more efficient).
* Markdown writer:
@@ -279,40 +234,151 @@
`--include-in-header`, `--include-before-body`, `--include-after-body`)
may be used.
- * LaTeX template: Update to iftex package (#7073) (Andrew Dunning)
+ * LaTeX template:
+
+ + Update to iftex package (#7073) (Andrew Dunning)
+ + Wrap url colours in braces (#7121) (Loïc Grobol).
* revealjs template: Add 'center' option for vertical slide centering.
(maurerle, #7104).
* Text.Pandoc.XML: Improve efficiency of `fromEntities`.
- * Test suite: a more robust way of testing the executable.
- Many of our tests require running the pandoc executable. This is
- problematic for a few different reasons. First, cabal-install will
- sometimes run the test suite after building the library but before
- building the executable, which means the executable isn't in place for
- the tests. One can work around that by first building, then building and
- running the tests, but that's fragile. Second, we have to find the
- executable. So far, we've done that using a function `findPandoc` that
- attempts to locate it relative to the test executable (which can be
- located using `findExecutablePath`). But the logic here is delicate and
- doesn't work with every combination of options. To solve both problems, we add
- an `--emulate` option to the `test-pandoc` executable. When `--emulate`
- occurs as the first argument passed to `test-pandoc`, the program simply
- emulates the regular pandoc executable, using the rest of the arguments
- (after `--emulate`). Thus, `test-pandoc --emulate -f markdown -t latex`
- is just like `pandoc -f markdown -t latex`.
- Since all the work is done by library functions, implementing this
- emulation just takes a couple lines of code and should be entirely
- reliable. With this change, we can test the pandoc executable by running
- the test program itself (locatable using `findExecutablePath`) with the
- `--emulate` option. This removes the need for the fragile `findPandoc`
- step, and it means we can run our integration tests even when we're just
- building the library, not the executable. [Note: part of this change
- involved simplifying some complex handling to set environment variables
- for dynamic library paths. I have tested a build with
- `--enable-dynamic-executable`, and it works, but further testing may be
- needed.]
+ * Text.Pandoc.MIME
+
+ + Add exported function `getCharset` [API change].
+
+ * Text.Pandoc.UTF8: change IO functions to return Text, not String
+ [API change]. This affects `readFile`, `getContents`, `writeFileWith`,
+ `writeFile`, `putStrWith`, `putStr`, `putStrLnWith`, `putStrLn`,
+ `hPutStrWith`, `hPutStr`, `hPutStrLnWith`, `hPutStrLn`, `hGetContents`.
+ This avoids the need to uselessly create a linked list of characters
+ when emitting output.
+
+ * Text.Pandoc.App
+
+ + Add `parseOptionsFromArgs` [API change, new exported function].
+
+ * Text.Pandoc.Citeproc.BibTeX
+
+ + `Text.Pandoc.Citeproc.writeBibTeXString` now returns
+ `Doc Text` instead of `Text` (#7068).
+ + Correctly handle `pages` (= `page` in CSL) (#7067).
+ + Correctly handle BibLaTeX `langid` (= `language` in CSL, #7067).
+ + In BibTeX output, protect foreign titles since there's no language
+ field (#7067).
+ + Clean up BibTeX parsing (#7049). Previously there was a messy code
+ path that gave strange results in some cases, not passing through raw
+ tex but trying to extract a string content. This was an artefact of
+ trying to handle some special bibtex-specific commands in the BibTeX
+ reader. Now we just handle these in the LaTeX reader and simplify
+ parsing in the BibTeX reader. This does mean that more raw tex will
+ be passed through (and currently this is not sensitive to the
+ `raw_tex` extension; this should be fixed).
+
+ * Text.Pandoc.Citeproc.MetaValue
+
+ + Correctly parse "raw" date value in markdown references metadata.
+ (See jgm/citeproc#53.)
+
+ * Text.Pandoc.Citeproc
+
+ + Use https URLs for links (Salim B, #7122).
+
+ * Text.Pandoc.Class
+
+ + Add `getTimestamp` [API change]. This attempts to read the
+ `SOURCE_DATE_EPOCH` environment variable and parse a UTC time
+ from it (treating it as a unix date stamp, see
+ https://reproducible-builds.org/specs/source-date-epoch/). If the
+ variable is not set or can't be parsed as a unix date stamp, then the
+ function returns the current date.
+
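The `getTimestamp` behavior described above (read `SOURCE_DATE_EPOCH`, treat it as a unix date stamp, otherwise use the current time) might look like this in outline. This is a Python sketch of the described logic, not the Haskell function itself; `get_timestamp` and its `environ` parameter are names assumed for illustration.

```python
import os
from datetime import datetime, timezone
from typing import Mapping, Optional


def get_timestamp(environ: Optional[Mapping[str, str]] = None) -> datetime:
    """Return a UTC time from SOURCE_DATE_EPOCH, else the current UTC time."""
    env = os.environ if environ is None else environ
    raw = env.get("SOURCE_DATE_EPOCH")
    if raw is not None:
        try:
            # The variable holds seconds since the unix epoch.
            return datetime.fromtimestamp(int(raw), tz=timezone.utc)
        except (ValueError, OverflowError, OSError):
            # Unparsable or out-of-range value: fall through to "now".
            pass
    return datetime.now(timezone.utc)
```

With the variable fixed, two builds of the same input get the same timestamp, which is what makes the output reproducible.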
+ * Text.Pandoc.Error
+
+ + Add `PandocUnsupportedCharsetError` constructor for
+ `PandocError` [API change].
+ + Export `renderError` [API change].
+ + Refactor `handleError` to use `renderError`. This allows us to render
+ error messages without exiting.
+ + Remove unused variables (Albert Krewinkel)
+
+ * Text.Pandoc.Extensions
+
+ + `Ext_task_lists` is now supported by org (and turned
+ on by default) (Albert Krewinkel, #6336).
+ + Remove `Ext_fenced_code_attributes` from allowed commonmark attributes
+ (#7097). This attribute was listed as allowed, but it didn't actually
+ do anything. Use `attributes` for code attributes and more.
+
+ * Lua subsystem:
+
+ + Always load built-in Lua scripts from default data-dir (Albert
+ Krewinkel). The Lua modules `pandoc` and `pandoc.List` are now always
+ loaded from the system's default data directory. Loading from a
+ different directory by overriding the default path, e.g. via
+ `--data-dir`, is no longer supported to avoid unexpected behavior
+ and to address security concerns.
+ + Add module "pandoc.path" (Albert Krewinkel, #6001, #6565).
+ The module makes it possible to work with file paths in a convenient and
+ platform-independent manner.
+ + Use strict evaluation when retrieving AST value from the stack
+ (Albert Krewinkel, #6674).
+
+ * Text.Pandoc.PDF
+
+ + Disable `smart` extension when building PDF via LaTeX.
+ This is to prevent accidental creation of ligatures like
+ `` ?` `` and `` !` `` (especially in languages with quotations like
+ German), and similar ligature issues. (See jgm/citeproc#54.)
+
+ * Text.Pandoc.CSV:
+
+ + Fix parsing of unquoted values (#7112). Previously we didn't allow
+ unescaped quotes in unquoted values, but they are allowed
+ in CSV.
+
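To illustrate the CSV fix above, here is a small single-line parser in Python in which a double quote is special only at the start of a field. It is a sketch of the rule as stated in the entry, not a port of Text.Pandoc.CSV; `parse_csv_line` is a hypothetical name, and ignoring any text between a closing quote and the next delimiter is a simplification of this sketch.

```python
from typing import List


def parse_csv_line(line: str, delim: str = ",") -> List[str]:
    """Split one CSV line; quotes inside unquoted values are kept literally."""
    fields: List[str] = []
    i, n = 0, len(line)
    while True:
        if i < n and line[i] == '"':
            # Quoted value: "" stands for one literal quote.
            i += 1
            buf = []
            while i < n:
                if line[i] == '"':
                    if i + 1 < n and line[i + 1] == '"':
                        buf.append('"')
                        i += 2
                    else:
                        i += 1  # closing quote
                        break
                else:
                    buf.append(line[i])
                    i += 1
            fields.append("".join(buf))
            # Simplification: skip anything up to the next delimiter.
            while i < n and line[i] != delim:
                i += 1
        else:
            # Unquoted value: runs to the next delimiter, quotes included.
            j = line.find(delim, i)
            if j == -1:
                j = n
            fields.append(line[i:j])
            i = j
        if i >= n:
            break
        i += 1  # step over the delimiter
    return fields
```

For example, `parse_csv_line('a,b"c,d')` keeps the quote in the second field instead of rejecting the line.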
+ * Test suite:
+
+ + Use a more robust method for testing the executable. Many
+ of our tests require running the pandoc executable. This
+ is problematic for a few different reasons. First,
+ cabal-install will sometimes run the test suite after
+ building the library but before building the executable,
+ which means the executable isn't in place for the tests.
+ One can work around that by first building, then building
+ and running the tests, but that's fragile. Second, we
+ have to find the executable. So far, we've done that using
+ a function `findPandoc` that attempts to locate it
+ relative to the test executable (which can be located
+ using `findExecutablePath`). But the logic here is delicate
+ and doesn't work with every combination of options. To solve both
+ problems, we add an `--emulate` option to the
+ `test-pandoc` executable. When `--emulate` occurs as the
+ first argument passed to `test-pandoc`, the program simply
+ emulates the regular pandoc executable, using the rest of
+ the arguments (after `--emulate`). Thus, `test-pandoc
+ --emulate -f markdown -t latex` is just like `pandoc -f
+ markdown -t latex`. Since all the work is done by library
+ functions, implementing this emulation just takes a couple
+ lines of code and should be entirely reliable. With this
+ change, we can test the pandoc executable by running the
+ test program itself (locatable using `findExecutablePath`)
+ with the `--emulate` option. This removes the need for the
+ fragile `findPandoc` step, and it means we can run our
+ integration tests even when we're just building the
+ library, not the executable. [Note: part of this change
+ involved simplifying some complex handling to set
+ environment variables for dynamic library paths. I have
+ tested a build with `--enable-dynamic-executable`, and it
+ works, but further testing may be needed.]
+ + Print accurate location if a test fails (Albert
+ Krewinkel). Ensures that tasty-hunit reports the location
+ of the failing test instead of the location of the helper
+ `test` function.
+
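The `--emulate` mechanism described above amounts to a check on the first argument. The following Python sketch shows the shape of that dispatch and how a test would build its command line; `dispatch`, `emulated_command`, and the two callbacks are hypothetical names, and the real test program is written in Haskell.

```python
from typing import Callable, List, Sequence


def dispatch(argv: Sequence[str],
             run_pandoc: Callable[[List[str]], int],
             run_tests: Callable[[List[str]], int]) -> int:
    """Act as the converter if the first argument is --emulate, else run tests."""
    if argv and argv[0] == "--emulate":
        # Everything after --emulate is passed on as ordinary pandoc arguments.
        return run_pandoc(list(argv[1:]))
    return run_tests(list(argv))


def emulated_command(test_exe: str, pandoc_args: Sequence[str]) -> List[str]:
    """Build the command a test uses in place of calling pandoc directly."""
    return [test_exe, "--emulate", *pandoc_args]
```

Because the test program can always locate itself, a test built this way no longer depends on a separately built executable being present.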
+ * Documentation: Update URLs and use `https` where possible (#7122,
+ Salim B).
* MANUAL.txt
@@ -328,6 +394,9 @@
* Makefile: in `make bench`, create CSV files for comparison and compare
against previous benchmark run. Add timestamp to CSV filenames.
+ * cabal.project: don't explicitly set -trypandoc.
+ If we do, this can't be overridden on the cabal command line.
+
* doc/lua-filters.md: improve documentation for
`pandoc.mediabag.insert`, `pandoc.mediabag.fetch`,
`directory`, `normalize` (Albert Krewinkel).
@@ -339,6 +408,14 @@
* Require citeproc 0.3.0.7, which correctly titlecases when titles
contain non-ASCII characters.
+ * Use skylighting 0.10.4. This version of skylighting uses xml-conduit
+ rather than hxt. This speeds up parsing of XML syntax definitions
+ fourfold, and removes four packages from pandoc's dependency graph:
+ hxt-charproperties, hxt-unicode, hxt-regex-xmlschema, hxt.
+
+ * Add script `tools/parseTimings.pl` to help pin down which
+ modules take the most time and memory to compile.
+
* Avoid unnecessary use of NoImplicitPrelude pragma (#7089) (Albert
Krewinkel)
@@ -353,6 +430,13 @@
+ Clean up benchmark code.
+ Allow specifying patterns using `-p blah`.
+ * trypandoc: add 2 second timeout.
+
+ * Use `-split-sections` in creating linux release binary.
+ This reduces executable size significantly (by about 30%).
+
+ * Remove `weigh-pandoc`. It's not really useful any more, now that our
+ regular benchmarks include data on allocation.
## pandoc 2.11.4 (2021-01-22)