author | John Luke Bentley <johnny_bentley@yahoo.com.au> | 2017-03-04 20:08:38 +1100 |
---|---|---|
committer | John MacFarlane <jgm@berkeley.edu> | 2017-03-04 10:08:38 +0100 |
commit | 07d51d9e30985c2d71b4341482982f81f206f9c4 (patch) | |
tree | 37fd1e6721e0913e9aeeea292450fa65eb554282 /test/lhs-test.html | |
parent | ce9d49ef0421f23fdc05fabd8d7f754d680bea47 (diff) | |
Make default.html5 polyglot markup conformant. (#3473)
Polyglot markup is HTML5 that is also valid XHTML. See
<https://www.w3.org/TR/html-polyglot>. With this change, pandoc's
html5 writer creates HTML that is both valid HTML5 and valid XHTML.
See jgm/pandoc-templates#237 for prior discussion.
* Add xml namespace to `<html>` element.
* Make all `<meta>` elements self closing.
See <https://www.w3.org/TR/html-polyglot/#empty-elements>.
* Add `xml:lang` attribute on `<html>` element, defaulting to blank, and
always include `lang` attribute, even when blank. See
<https://www.w3.org/TR/html-polyglot/#language-attributes>.
* Update test files for template changes.
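The changes listed above can be spot-checked mechanically. The following is a minimal sketch in Python, not part of pandoc or its test suite: the function name `polyglot_problems` is hypothetical, and it checks only the points touched by this commit (well-formedness as XML, the XHTML namespace on the root element, and the presence of both language attributes), not full Polyglot Markup conformance.

```python
import xml.etree.ElementTree as ET
from typing import List

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def polyglot_problems(document: str) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper: covers only the items changed in this commit.
    """
    try:
        # An XML parser rejects unclosed void elements such as <meta ...>.
        root = ET.fromstring(document)
    except ET.ParseError as err:
        return ["not well-formed XML: %s" % err]

    problems = []
    if root.tag != "{%s}html" % XHTML_NS:
        problems.append("root element is not <html> in the XHTML namespace")

    lang = root.get("lang")
    xml_lang = root.get("{%s}lang" % XML_NS)
    if lang is None:
        problems.append("missing lang attribute on <html>")
    if xml_lang is None:
        problems.append("missing xml:lang attribute on <html>")
    if lang is not None and xml_lang is not None and lang != xml_lang:
        problems.append("lang and xml:lang have different values")
    return problems


NEW_OUTPUT = """<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
<head>
  <meta charset="utf-8" />
  <title></title>
</head>
<body></body>
</html>"""

OLD_OUTPUT = """<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8">
  <title></title>
</head>
<body></body>
</html>"""

print(polyglot_problems(NEW_OUTPUT))
print(polyglot_problems(OLD_OUTPUT))
```

The two sample documents are abbreviated stand-ins for the template output after and before this change. The old form fails at the parsing step because `<meta charset="utf-8">` is never closed.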
The key justification for having language values default to blank: it
turns out the HTML5 spec requires it (as I read it). Under
[the HTML5 spec, section "3.2.5.3. The lang and xml:lang
attributes"](https://www.w3.org/TR/html/dom.html#the-lang-and-xmllang-attributes),
providing attributes with blank contents both:
* Has meaning, "unknown", and
* Is a MUST (written as "must") if a language value is not provided ...
> The lang attribute (in no namespace) specifies the primary language
> for the element's contents and for any of the element's attributes that
> contain text. Its value must be a valid BCP 47 language tag, or the
> empty string. Setting the attribute to the empty string indicates that
> the primary language is unknown.
In short, it seems that where a language value is not provided, a
blank value MUST be provided for Polyglot Markup conformance, because
the HTML5 spec stipulates a "must". So although the Polyglot Markup spec
is unclear on this issue, it would seem that, were it correctly written,
it would require blank attributes.
Further justifications are found at
https://github.com/jgm/pandoc-templates/issues/237#issuecomment-275584181
(but the HTML5 spec justification given above would seem to be the
clincher).
In addition to having lang values default to blank, I recommend that, when an
author does not provide a lang value, pandoc emit a warning message like the
following on command execution:
> Polyglot markup stipulates that 'The root element SHOULD always specify
> the language'. It is therefore recommended you specify a language value in
> your source document. See
> <https://www.w3.org/International/articles/language-tags/> for valid
> language values.
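The recommended behaviour (blank defaults plus a warning) could look roughly like the sketch below. This is illustrative Python only and is not how pandoc implements its templates or warnings; `html_open_tag` and `LANG_WARNING` are hypothetical names, and the warning text is the one proposed above.

```python
from html import escape

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Warning text taken from the proposal in this commit message.
LANG_WARNING = (
    "Polyglot markup stipulates that 'The root element SHOULD always specify "
    "the language'. It is therefore recommended you specify a language value "
    "in your source document. See "
    "<https://www.w3.org/International/articles/language-tags/> for valid "
    "language values."
)


def html_open_tag(lang=""):
    """Build the opening <html> tag and an optional warning.

    Hypothetical helper: both language attributes are always emitted,
    defaulting to the empty string ("unknown") when no value is given.
    """
    value = escape(lang or "", quote=True)
    tag = '<html xmlns="%s" lang="%s" xml:lang="%s">' % (XHTML_NS, value, value)
    warning = None if lang else LANG_WARNING
    return tag, warning


tag, warning = html_open_tag()
print(tag)
if warning:
    print("[WARNING]", warning)
```

Calling `html_open_tag("en-AU")` in this sketch fills both attributes with the same value and returns no warning.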
Diffstat (limited to 'test/lhs-test.html')
-rw-r--r-- | test/lhs-test.html | 8 |
1 files changed, 4 insertions, 4 deletions
diff --git a/test/lhs-test.html b/test/lhs-test.html
index 2c3b6b0f8..759b0955b 100644
--- a/test/lhs-test.html
+++ b/test/lhs-test.html
@@ -1,9 +1,9 @@
 <!DOCTYPE html>
-<html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
 <head>
-  <meta charset="utf-8">
-  <meta name="generator" content="pandoc">
-  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+  <meta charset="utf-8" />
+  <meta name="generator" content="pandoc" />
+  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
   <title></title>
   <style type="text/css">code{white-space: pre;}</style>
   <style type="text/css">