Re: ISSUE-55: Re-enable @profile in HTML5 (draft 1) from Smylers on 2009-10-10 (public-html@w3.org from October 2009)

From: Smylers <Smylers@stripey.com>
Date: Sat, 10 Oct 2009 22:03:37 +0100
To: Manu Sporny <msporny@digitalbazaar.com>, HTMLWG WG <public-html@w3.org>
Message-ID: <20091010210337.GC31583@stripey.com>

Manu Sporny writes:

> Jonas Sicking wrote:
> 
> > On Tue, Sep 29, 2009 at 12:33 AM, Toby Inkster <tai@g5n.co.uk> wrote:
> > 
> > > On Mon, 2009-09-28 at 17:32 -0700, Jonas Sicking wrote:
> > > 
> > > > I would personally recommend that RDFa follow the strategy that
> > > > HTML uses
> > >
> > > To only provide version identifiers for the first four versions?
> > 
> > To never break backwards compatibility with existing content. The
> > "version identifiers" in earlier versions of HTML were never, to my
> > knowledge, used as a way to break compatibility with older versions of
> > the specification.
> 
> At present, your statement that HTML5 will "never break backwards
> compatibility with existing content" is not true. HTML5 breaks
> backward compatibility in many (good) ways... it obsoletes a number of
> HTML features:

Yes, but that doesn't affect backwards compatibility; it merely tells
authors not to use these features.

> after reading Section 12.2, if I were to write a User Agent - I expect
> that I wouldn't have to support any of those features.

Your expectation is wrong: document conformance affects authors, but
HTML5 goes to great lengths to define user agent behaviour for all sorts
of things authors do which aren't conforming -- including both things
which have never been conforming, and things which were conforming to
previous versions of the HTML specification.

So things which currently 'work' on the web will continue to work with
an HTML5 user agent.

> For example, if my web browser saw <center> , it would ignore it
> completely and be fully conformant with the HTML5 specification, IIRC.

It could ignore it, and it would be conformant, but not for the reason
you think.  _All_ the rendering information in HTML5 is given as
'expectations', which user agents are free to ignore.  The rendering
expectations for <center> are:

  The center element ... [is] expected to center text within [itself],
  as if [it] had [its] 'text-align' property set to 'center' in a
  presentational hint, and to align descendants to the center.

Changing <center> to being conforming would have no effect on the above
expectations, and no effect on whether an HTML5 user agent is required
to centre its contents.

> Since most HTML4 documents are now magically HTML5 documents, how does
> this not break backward compatibility in this particular scenario? In
> an HTML5 conformant browser, suddenly, text will no longer be required
> to be centered.

A graphical user agent that follows the expectations in the rendering
section will still centre the text.  A user agent (typically one which
isn't graphical) which chooses to ignore the rendering expectations
would likely be ignoring all of them, and not attempting to give a view
of webpages such as many users are familiar with in mainstream browsers;
a browser which chose to ignore the expected rendering would be
unlikely to gain significant mainstream usage.

Regardless of the optionality of the entire rendering section, HTML5
makes it _possible_ for a conforming user agent to be backwards
compatible.  (Or if you'd prefer, the entire rendering section could be
changed to be MUST requirements for browsers, and HTML5 as a whole would
still be consistent and possible.)  That would not be the case if HTML5
defined <center> as having different behaviour from in HTML4, or even if
HTML5 demanded that <center> be ignored and not centre its contents.

> How do we allow for deep changes to a language in the future, but
> ensure backwards compatibility for those that want to ensure their
> document is processed in the same way in the future?

HTML5 side-steps that issue by carefully not making
backwards-incompatible changes.

> Somebody that is far more knowledgeable and astute than I am (I'm
> looking at you Dan Connolly) about the history of HTML and what
> compatibility was or was not broken from version to version will have
> to enlighten us. Specifically, on whether your assertion that
> pre-HTML5 hasn't broken backwards compatibility and depended on the
> @version attribute to differentiate between different versions of the
> spec, is true.

Whether HTML has depended on the version attribute is a matter for
browser developers, not spec writers.  Jonas is a Mozilla developer.  If
he says that Mozilla doesn't look at version numbers in order to process
'HTML4 pages' differently from 'HTML3 pages' (etc) then he knows what
he's talking about.

> So,
> 
>  * HTML5 breaks backwards compatibility for several good reasons.

It doesn't.

>  * There is currently no way for an author to specify that they would
>    like their documents to be processed as HTML5 instead of HTML6.

That's true, but then HTML6 doesn't exist yet.  HTML6 _may_ be developed
with the same goals as HTML5, and as such retain backwards compatibility
such that processing an HTML5 document with an HTML6 user agent will
yield exactly the same behaviour and output as doing so with an HTML5
user agent.  In which case no version specifier is needed.

Or HTML6 may make so many incompatible changes, along XHTML2 lines, that
it fails to gain significant market share on the web or in mainstream
browsers, in which case no version specifier is needed.

If it turns out HTML6 documents _do_ need distinguishing from HTML5
documents, then those creating HTML6 are going to be better placed to
work out the best way of doing that than we are now.  Perhaps they'll
add a version attribute.  But retaining backwards compatibility with the
current web will require that documents without that attribute (or
whatever) be processed according to HTML5 rules.  So it'll be as easy to
add the attribute then as now, meaning doing it now has no advantage for
HTML6.

>  * There is currently no way for an author to specify that their
>    document uses a number of extended processing behaviors built on
>    top of HTML5.

The author specifies them merely by using them.  If an author wishes to
use MathML in an HTML5 document then she just uses it; if an HTML5
browser which supports MathML encounters some MathML then it will
process it.

Having the author specify in a document's <head> that it contains
MathML does not in practice offer any advantages:

* If a user agent chooses not to support MathML then it isn't going to
  be able to display the MathML regardless of whether it's called out in
  the <head> or not.

* If a user agent does support MathML then it would be unhelpful for it
  to decline to process MathML unless it has been declared in the
  <head>.  Again, the <head> makes no difference to it.

* If the use of MathML in a webpage needs to be flagged in the <head>
  that makes life awkward for users of content management systems or
  people sharing snippets of HTML, who often have no control over the
  surrounding page HTML their content ends up in.
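
To illustrate "specifies them merely by using them", here is a rough
sketch in Python.  The class and function names are made up for this
example, and a real browser would of course do this inside its own
parser rather than with Python's html.parser; the point is only that
the mark-up itself is enough to tell whether MathML is in use, even
for a snippet that arrives with no <head> at all.

```python
from html.parser import HTMLParser


class MathDetector(HTMLParser):
    """Hypothetical helper: notes whether a <math> element was seen."""

    def __init__(self):
        super().__init__()
        self.found_math = False

    def handle_starttag(self, tag, attrs):
        # html.parser lower-cases tag names before calling this method.
        if tag == "math":
            self.found_math = True


def uses_mathml(fragment: str) -> bool:
    """Return True if the fragment contains a <math> start tag.

    No <head> or page-level declaration is consulted; the fragment
    alone decides the answer.
    """
    detector = MathDetector()
    detector.feed(fragment)
    detector.close()
    return detector.found_math


# A snippet as it might be pasted into a CMS, with no surrounding page.
snippet = "<p>Area: <math><mi>x</mi></math></p>"
print(uses_mathml(snippet))
print(uses_mathml("<p>no maths here</p>"))
```

The same snippet gives the same answer whichever page it ends up in,
which is what matters for content management systems and syndication.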

>  * There is currently no way for an author to specify that their
>    document should be processed via extended processing behavior
>    using FeatureX version 1.0 instead of FeatureX version 2.0.

True.  But possibly the FeatureX 2.0 spec could define that, rather than
there needing to be a general HTML mechanism for it.  Given how
undesirable backwards incompatibility is, HTML5 should not be
encouraging it or making it easy.

> As I've stated previously, this is a technical issue and is directly
> relevant to XMLLiteral generation in RDFa 1.0 vs. RDFa 1.1.

Manu Sporny writes:

> Henri Sivonen wrote:
>
> > [Somebody whose attribution has previously been snipped wrote:]
> > 
> > > we don't need to indicate a version until there is some different
> > > processing to do -- such as a difference between a version 1.1 and
> > > 1.0.
> > 
> > But if you do need a version indicator for 1.1, wouldn't the problem of
> > not completely controlling the page arise again? How is a version
> > indicator for 1.1 supposed to work in Planet syndication?
> 
> You wouldn't /need/ a version indicator for 1.1. @version isn't a
> MUST... it is a SHOULD. If the @version indicator wasn't specified on
> the page, but the User Agent still decided to extract RDFa from the
> page, then the latest RDFa processor rules known to the User Agent
> should be used. So, if RDFa 1.1 were the latest version... the RDFa 1.1
> rules would be used to extract the data from the page.

That breaks backwards compatibility: since the version indicator isn't
required, there will be existing version 1.0 content which isn't
labelled as such; then a version 1.1 user agent is released, which
treats unlabelled content as version 1.1, and the existing
1.0-but-unlabelled content suddenly starts being treated differently.

Retaining backwards compatibility requires that _all_ version 1.0
content continues to be treated as it was in version 1.0 in later user
agents.

So if a version 1.1 user agent is to process version 1.0 content
differently from version 1.1 content then at least one group of content
MUST be labelled in some way; you can't have both versions being
unlabelled (even optionally).

You _can_ retain backwards compatibility if you trigger the version 1.1
processing by encountering a version 1.1 label -- that is, an author
MUST label version 1.1 content as such, with everything else being
treated as 1.0.  But that means you can't have the 1.1 label as being
merely a SHOULD requirement.
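
A minimal sketch of the two schemes, in Python.  The function names and
the idea of the label being a plain string are assumptions made for
this example; neither comes from the RDFa or HTML5 drafts.  The first
function is the backwards-compatible rule (only an explicit 1.1 label
triggers 1.1 processing); the second is the "label is only a SHOULD"
scheme, where unlabelled content gets whatever the user agent knows as
the latest rules.

```python
from typing import Optional


def select_rules_compatible(label: Optional[str]) -> str:
    """Backwards-compatible rule for a user agent that knows 1.0 and 1.1.

    Only an explicit "1.1" label switches on 1.1 processing.  Everything
    else, including unlabelled content, is treated as 1.0.
    """
    if label == "1.1":
        return "1.1"
    return "1.0"


def select_rules_latest_default(label: Optional[str], latest: str) -> str:
    """The optional-label scheme: unlabelled content gets the latest rules.

    `latest` is the newest version the user agent implements.
    """
    return label if label is not None else latest


# Existing, unlabelled 1.0 content (label is None):
# how an old (1.0-only) and a new (1.1-aware) agent treat it.
old_agent = select_rules_latest_default(None, latest="1.0")
new_agent = select_rules_latest_default(None, latest="1.1")
print(old_agent, new_agent)            # the treatment changes

print(select_rules_compatible(None))   # stays on 1.0 rules
print(select_rules_compatible("1.1"))  # labelled content gets 1.1 rules
```

Under the second scheme the same unlabelled document is handled by
different rules once the user agent is upgraded; under the first it
never is, but only because 1.1 authors are required to label.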

Which is a problem for syndication and the like if the version label is
in the document <head>, because in the general case you can't control
that.

So you're going to get more success if the version label is directly on
the mark-up in question, for example by using "rdf2" rather than "rdf"
to signal it.  In which case HTML doesn't need any versioning stuff in
the <head>.

Smylers
Received on Saturday, 10 October 2009 21:04:07 UTC