This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 11737 - Change content model of <hgroup> to single <hx> + zero or more <sh> (sub/suphead) elements
Status: CLOSED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: All All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on: 11731
Blocks:
Reported: 2011-01-11 18:56 UTC by Leif Halvard Silli
Modified: 2011-08-04 05:14 UTC
CC List: 13 users

Description Leif Halvard Silli 2011-01-11 18:56:40 UTC
Instead of 
<hgroup>
    <h1>Main Heading</h1>
    <h2>Subheading</h2>
</hgroup>

we should adopt

<hgroup>
    <h1>Main Heading</h1>
    <sh>Subheading</sh>
</hgroup>

Thus, we should say that <hgroup> can take a single <h[1-6]> element plus zero or more <sh> elements. (Or <sheading> or <subhead>, if those would be better names than <sh>.)

For an outline of the proposal's advantages, see:
http://lists.w3.org/Archives/Public/public-html/2011Jan/0093
http://lists.w3.org/Archives/Public/public-html/2011Jan/0094

Summary
* It solves the main problem of the current content model - that some headings in a hgroup no longer have h1-h6 semantics (except that they do, in legacy UAs) - this is hard to grasp and hard to "calculate" the effects of.

* It is easier to style with CSS. It doesn't require the same detailed view of the markup structure as the current content model does.

* It is semantically less confusing, as hgroup with this content model only contains a single hx.

* It has better semantical fallback:  Both a legacy UA and a HTML5 UA would create the same outline. In addition, a HTML5 user agent would EITHER see the whole thing as a heading (if we define it like that) OR it would see the hgroup as solely a container for grouping <sh> together with <hx>.

* It makes the outline algorithm easier to understand and implement. At the same time, the current outline - as described in books and implemented here and there - would continue to work, except when the highest rank heading follows after lower rank heading(s).

* It seems harder to use incorrectly. The only obvious mistake that one could make would be to use <sh> outside a <hgroup>,  which would be relatively harmless (semantics of <div>).

Disadvantages:

* Doesn't support the use case of multiple subheadings of different weights. This seems like a very minor use case that could be addressed in the future if it is significant.

* In legacy UAs the <sh> would count as an inline element. This is easy to fix via CSS. But without that fix, if <sh> is the last child of <hgroup> and the first element after <hgroup> is an inline element, then the two elements would run together. See this test in, for example, Firefox: http://software.hixie.ch/utilities/js/live-dom-viewer/saved/771
But if hgroup is mostly already "fixed" as a block element via CSS anyway, then it would not matter.
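
A minimal sketch of the CSS fix mentioned in the last disadvantage, assuming the proposed <sh> element (which is only a proposal in this bug and not part of any spec). Unknown elements are rendered inline by default, so an explicit display rule is what keeps the subheading from running into the inline content that follows. The page title and the text of the <span> are made up for illustration.

```html
<!DOCTYPE html>
<html>
<head>
<title>sh fallback sketch</title>
<style>
  /* Assumption: <sh> is the hypothetical element proposed in this bug.
     Elements a UA does not know are inline by default, so make both
     the group and the subheading block-level explicitly. */
  hgroup, sh { display: block; }
</style>
</head>
<body>
<hgroup>
    <h1>Main Heading</h1>
    <sh>Subheading</sh>
</hgroup>
<!-- Without the rule above, this inline element would continue on the
     same line as the subheading in a UA that treats <sh> as inline. -->
<span>Inline content following the heading group.</span>
</body>
</html>
```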
Comment 1 Ian 'Hixie' Hickson 2011-02-14 22:56:50 UTC
> * It solves the main problem of the current content model - that some headings
> in a hgroup no longer have h1-h6 semantics (except that they do, in legacy UAs)
> - this is hard to grasp and hard to "calculate" the effects of.

I don't understand this point. Can you elaborate?


> * It is easier to style with CSS. It doesn't require the same detailed view of
> the markup structure as the current content model does.

The hgroup styling is hardly burdensome:

   hgroup > h1 { ... }
   hgroup > h2 { ... }

The proposal above doesn't really improve it, either - I don't think this is any more intuitive:

   hgroup > h1 { ... }
   sh { ... }


> * It is semantically less confusing, as hgroup with this content model only
> contains a single hx.

I don't understand why having multiple levels of headings is semantically confusing. It seems pretty semantically clear to me.


> * It has better semantical fallback:  Both a legacy UA and a HTML5 UA would
> create the same outline. In addition, a HTML5 user agent would EITHER see the
> whole thing as a heading (if we define it like that) OR it would see the hgroup
> as solely a container for grouping <sh> together with <hx>.

Not sure what practical effects this would have. What concrete software or users would be affected by this "semantical fallback"?


> * It makes the outline algorithm easier to understand and implement. At the
> same time, the current outline - as described in books and implemented here and
> there - would continue to work, except when the highest rank heading follows
> after lower rank heading(s).

The outline algorithm's complexity is not due to <hgroup>. I don't think this proposal materially affects the complexity of the algorithm. Indeed <hgroup> is not explicitly mentioned in the outline algorithm currently at all.


> * It seems harder to use incorrectly. The only obvious mistake that one could
> make would be to use <sh> outside a <hgroup>,  which would be relatively
> harmless (semantics of <div>).

I don't understand why that is harder than misusing <hgroup>. What failure scenarios with <hgroup> are you expecting to commonly see?


Problems with this proposal:
* It doesn't degrade gracefully in legacy UAs.
* It doesn't pave existing cowpaths (which use <h1>/<h2> - see e.g. the HTML4 spec).
* It only supports one level of subheading.
Comment 2 Leif Halvard Silli 2011-02-15 06:52:49 UTC
(In reply to comment #1)
> > * It solves the main problem of the current content model - that some headings
> > in a hgroup no longer have h1-h6 semantics (except that they do, in legacy UAs)
> > - this is hard to grasp and hard to "calculate" the effects of.
> 
> I don't understand this point. Can you elaborate?

In a HTML5 UA, <hgroup> has the "heading value" of the child with the highest heading value.
Whereas in a HTML4 UA, <hgroup> would have no "heading value", while each hn child would have a heading value.

By saying that only one hn element is permitted, the HTML4 UA and the HTML5 UA would get a much more similar effect - they would both see only one header.

(The disadvantage would be that HTML4 UAs would not see any heading value for the sh elements.)

> > * It is easier to style with CSS. It doesn't require the same detailed view of
> > the markup structure as the current content model does.
> 
> The hgroup styling is hardly burdensome:
> 
>    hgroup > h1 { ... }
>    hgroup > h2 { ... }

I would assume that if the <h2> follows a <h1> inside a <hgroup>, then the h2 would be styled differently from when the <h2> appears outside hgroup, no? Whereas the selectors you demonstrate there treat both cases the same. If you treat both cases the same, then it seems to me that you would treat a <h2> inside <hgroup> the same as a <h2> outside <hgroup>. But in that case, you would probably not need 'hgroup' inside the selector!

> The proposal above doesn't really improve it, either - I don't think this is
> any more intuitive:
> 
>    hgroup > h1 { ... }
>    sh { ... }

The simplification is that I can do 
hgroup h1+sh { }
hgroup h2+sh { }

Of course, one can also do, for example:
hgroup h1+h1 { }
hgroup h1+h2 { }
hgroup h1+h3 { }
hgroup h1+h4 { }
hgroup h1+h5 { }
hgroup h1+h6 { }
hgroup h2+h1 { }
hgroup h2+h2 { }
hgroup h2+h3 { }
hgroup h2+h4 { }
hgroup h2+h5 { }
hgroup h2+h6 { }

However, as you can see, this creates many more options for what the element after the h1 element is. It thus creates many options that the author doesn't need and which only needlessly complicate things. If a detailed differentiation is needed, then the author can use class names.

The sh element also allows you to do this - provided that sh is not the first element:
hgroup h1:first-child+sh {}

as well as
h1+sh {}

So, no, to me 'sh' is really much simpler because it creates fewer options.

> > * It is semantically less confusing, as hgroup with this content model only
> > contains a single hx.
> 
> I don't understand why having multiple levels of headings is semantically
> confusing. It seems pretty semantically clear to me.

The 'sh' is also a heading - but it is a sub/super heading. So then we are on the same page in that detail. The confusion that I have in mind is related to the outline effect.

> > * It has better semantical fallback:  Both a legacy UA and a HTML5 UA would
> > create the same outline. In addition, a HTML5 user agent would EITHER see the
> > whole thing as a heading (if we define it like that) OR it would see the hgroup
> > as solely a container for grouping <sh> together with <hx>.
> 
> Not sure what practical effects this would have. What concrete software or
> users would be affected by this "semantical fallback"?

These are the concrete, affected applications I am aware of, off the top of my head:

* iCab has a feature whereby it creates an outline of the headers on the page.
* Amaya has a feature whereby it generates a ToC based on the heading elements of the document one has authored.
* PrinceXML generates outlines in PDF based on the headers it finds in the document. (This outline is presented in the sidebar of the PDF document in the Mac OS X Preview.app and, probably, in a similar way in Adobe.) This is a feature that is independent of ToCs that the author creates manually.

I also believe that AT are affected in similar ways, but I have no concrete test results.

> > * It makes the outline algorithm easier to understand and implement. At the
> > same time, the current outline - as described in books and implemented here and
> > there - would continue to work, except when the highest rank heading follows
> > after lower rank heading(s).
> 
> The outline algorithm's complexity is not due to <hgroup>. I don't think this
> proposal materially affects the complexity of the algorithm. Indeed <hgroup> is
> not explicitly mentioned in the outline algorithm currently at all.

You might be correct - I would have to recheck the algorithm. But assuming you are entirely correct, my only point in this regard would be that it is simpler to assume that all hn elements should be inserted into the outline than it is to assume, and calculate with, the fact that there are some exceptions, namely those hn elements that are not the first and highest of rank inside a hgroup element.

> > * It seems harder to use incorrectly. The only obvious mistake that one could
> > make would be to use <sh> outside a <hgroup>,  which would be relatively
> > harmless (semantics of <div>).
> 
> I don't understand why that is harder than misusing <hgroup>. What failure
> scenarios with <hgroup> are you expecting to commonly see?

By harder to use incorrectly, I meant that it is harder to misuse a "sh" outside a <hgroup> than it would be to use a hn outside a hgroup incorrectly. (I was partly copying an argument put forward by James, I think, so it doesn't entirely fit.)

If a sh by mistake lands below the hgroup or if the author fails to use hgroup at all, then the outline still becomes the same as if he/she had wrapped the sh inside the hgroup. Thus, the only advantage of wrapping the hgroup around a "loose" <sh> would be that it brings "headerness" to the 'sh' element. Outside the hgroup, it has no headerness.

In your defence, I would then say that, precisely because a "loose" h1-h6 element needs to be inside the hgroup in order to lose its "outline-ness", this is an argument in favour of the current solution: In a HTML5 aware user agent, moving the hn element inside the hgroup may give the author visual feedback that he/she is removing the outline-ness.

In my defence: one would get the same effect by using <sh>. But, again, the disadvantage would be that this would happen even if I did not move the <sh> inside <hgroup>. I can see some advantages to the current solution in that regard. The only way to overcome this would be to say that 'sh' *is* included in the outline unless it is kept inside <hgroup>. Could that be an option?

> Problems with this proposal:
> * It doesn't degrade gracefully in legacy UAs.
> * It doesn't pave existing cowpaths (which use <h1>/<h2> - see e.g. the HTML4
> spec).
> * It only supports one level of subheading.

Replies about the problems you see:

* Strictly speaking, since a sh could be used both before and after the hn element, it would be possible to discern both a "before subheader" and an "after subheader".
* But otherwise: Yes, it only supports one level of subheading. However, I consider the internal fine-grained sub-heading levels as mostly a visual formatting thing. Can it be proven that it is otherwise?

* Regarding the cowpath of the HTML4 spec: Is it certain that 
    <h2>W3C Recommendation 24 December 1999</h2>
   does not belong in the outline? If there is a problem with that, then why didn't they use a <p> element instead?
* A HTML5 parser will not jump over that h2 element when creating the outline either. So I don't quite see the cowpath.

* Is it relevant to consider that _specs_ create cowpaths? I have the feeling that you, in general, find that specs are not "in the wild" examples.

* I don't understand your "degrade gracefully in legacy UA" point of view. It seems to me that whether this is true depends on whether it is most important to preserve a "headerness" (which nevertheless does not cause an outline) or to create a useful outline.
* I believe that, to the extent that current tools and browsers provide outlines and/or auto-generated tables of contents, authors have adapted themselves to this. I, for one, would not have followed the HTML4 "cowpath" if I did not want the H2 element to take part in the table of contents/outline.
* Therefore, this so-called cowpath, if it exists outside W3 specifications, seems - eventually - more a result of a focus on the visual results than on the semantics. And in the cases when it is not about visual effect, I bet that they *did* want the outline that this gave them.
* I would of course not claim that HTML4 or XML1.0 was authored by people who cared for the visual results more than the semantic results. And there are of course two options: the HTML4 editor could have been very keen on saying that <imaginary-quote>the 'W3C Recommendation 24 December 1999' *is* a header, despite that I am unable to remove it from the outline</imaginary-quote>. The other option is that the HTML4 spec editor was entirely satisfied with both the outline and the headerness. But what do we know? It is pure speculation to me.

I think it is correct to assume that one would want to have subheadings, and for that reason - certainly - it must also be correct to assume that many occurrences of 
    <h1>Line 1</h1><h2>Line 2</h2>
could very well have been meant as a
    <hgroup><h1>Line 1</h1><h2>Line 2</h2></hgroup>
But that, eventually, to me becomes a cowpath for the very subheading concept, and not a cowpath for that exact way to construct subheadings.

I do see that the current content model of hgroup has some advantages. For instance, it introduces only one element - hgroup - instead of two - hgroup and sh. And, if you want to change a h2 into a subheader, you just wrap a hgroup around it, instead of turning it into another element first.

But when evaluating the legacy UA compatibility issue, I believe that it is more important to consider the outline effect than it is to consider the header effect. And if I am correct in that way of looking at it, then we must look away from such examples as the HTML4 specification: they don't tell us anything, and the HTML4 spec itself will never be affected by our decisions anyway, since the HTML4 spec was set in stone 12-13 years ago. We can assume that the authors of "the next spec" will want to use hgroup, but we cannot know whether the editor simply wants to use a new feature or if he/she truly did not, and never did, intend to include 'W3C Recommendation day-month-year' in the outline.
Comment 3 Leif Halvard Silli 2011-02-15 07:02:19 UTC
(In reply to comment #2)
> (In reply to comment #1)

>In a HTML5 UA, <hgroup> has the "heading value" of the child with the highest
>heading value.
>Whereas in a HTML4 UA, <hgroup> would have no "heading value" whereas each hn
>child would have a heading value.

Please read the above like so:

In a HTML5-UA, <hgroup> has the "outline value" of the child with the highest heading value, and the same "heading value", except that it would be subdivided into different internal levels.
Whereas in a HTML4 UA, <hgroup> would have no "heading value", while each hn
child would have a heading value.

Furthermore: if hgroup can contain a single hn plus one or more sh, then legacy UAs and HTML5 UAs will largely agree about the outline result (provided that HTML5's <section> and <article> are not used). They will also _largely_ agree about the headings they see, with the difference that the HTML5-UA would see the sh elements as subheadings to the highest level heading of the hgroup element, while legacy UAs will only see them as close to the header and thus, at least if they come after the hn element, see them as linked to that header, without themselves being headers in any way.
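
The comparison above can be sketched as annotated markup. This is only an illustration of the argument as stated in this comment, not output from any actual UA; <sh> is the hypothetical element proposed in this bug, and the HTML comments describe the behaviour that the comment claims for each kind of user agent.

```html
<!-- Current content model: several hn elements inside hgroup -->
<hgroup>
    <!-- HTML5 UA: supplies the rank of the group's single outline entry.
         Legacy UA: an ordinary rank 1 heading. -->
    <h1>Main Heading</h1>
    <!-- HTML5 UA: a subheading, not a separate outline entry.
         Legacy UA: an ordinary rank 2 heading, so the outlines differ. -->
    <h2>Subheading</h2>
</hgroup>

<!-- Proposed content model: one hn element plus <sh> (hypothetical) -->
<hgroup>
    <!-- Both kinds of UA: the only heading that enters the outline. -->
    <h1>Main Heading</h1>
    <!-- HTML5 UA: a subheading of the h1.
         Legacy UA: an unknown element next to the heading, not a heading. -->
    <sh>Subheading</sh>
</hgroup>
```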
Comment 4 Ian 'Hixie' Hickson 2011-05-03 19:37:39 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: There's no <hgroup> in the W3C spec anymore.
Comment 5 Michael[tm] Smith 2011-08-04 05:14:26 UTC
mass-move component to LC1