Re: Rebuttal of dropHgroup CP (ISSUE-164)

On Sun, 06 Nov 2011 16:00:20 +0100, Lars Gunther <gunther@keryx.se> wrote:

> Look at the total sum of things this way:
>
> 1. We want a way to mark up subtitles.
>
> 2. Subtitles should not appear as headings in the document outline.
>
> What solution is KISS?
>
> a. Use headings and then hide them.
>
> b. Do not use headings to begin with.

There are pros and cons to both. For (b) there's the question of what it
means when the subheading element is misplaced (see below).

>
> 2011-11-06 11:13, Simon Pieters skrev:
>> Comments on http://www.w3.org/html/wg/wiki/ChangeProposals/dropHgroup
>>
>>> Hgroup alters the meaning of already implemented elements based on
>>> position.
>>
>> h1-h6 have different meaning in different places even if hgroup were to
>> be dropped. For instance, h2 in the following examples is top-level
>> heading, second-level heading and third-level heading, respectively:
>>
>> <body>
>> <h2>foo</h2>
>> </body>
>>
>> <body>
>> <h1>...</h1>
>> <h2>foo</h2>
>> </body>
>>
>> <body>
>> <h1>...</h1>
>> <section>
>> <h1>...</h1>
>> <section>
>> <h2>foo</h2>
>> </section>
>> </section>
>> </body>
>>
>
> Even if the LEVEL is affected by sections, h1-h6 are still headings.
>
> Using hgroup would alter the meaning in a more drastic way.

They are grouped into one logical heading.
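
To make the three quoted h2 examples concrete, here is a rough sketch of
how nesting decides a heading's level. It is not the full outline
algorithm: it works on plain objects rather than DOM nodes, handles only
<section> as sectioning content, and ignores sectioning roots and hgroup.
The function name and the object shape are made up for illustration.

```javascript
// Sketch only. Nodes are hypothetical plain objects, e.g.
// { tag: "h2" } or { tag: "section", children: [...] }.
function headingDepths(root) {
  var result = [];
  function walk(node, baseDepth) {
    // Ranks of the headings whose implied sections are still open here.
    var open = [];
    (node.children || []).forEach(function (child) {
      var m = /^h([1-6])$/.exec(child.tag);
      if (m) {
        var rank = parseInt(m[1], 10);
        // A heading of equal or higher rank closes the implied
        // sections opened by the headings before it.
        while (open.length > 0 && open[open.length - 1] >= rank) {
          open.pop();
        }
        open.push(rank);
        result.push({ tag: child.tag, depth: baseDepth + open.length });
      } else if (child.tag === "section") {
        // An explicit section nests inside the innermost open section,
        // or inside the untitled first section if no heading came first.
        walk(child, baseDepth + Math.max(open.length, 1));
      }
    });
  }
  walk(root, 0);
  return result;
}

// The third quoted example: body > h1, section > h1, section > h2.
var third = { tag: "body", children: [
  { tag: "h1" },
  { tag: "section", children: [
    { tag: "h1" },
    { tag: "section", children: [{ tag: "h2" }] }
  ] }
] };
var thirdDepths = headingDepths(third);
```

The last entry of thirdDepths is the h2, and its depth is 3, even though
the element name says "2".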

>>> While this semantic construct may be relatively easily understood by web
>>> designers working on html code it is a nightmare for developers who
>>> make tools to parse that markup and make something useful of it. This
>>> includes:
>>> * JavaScripts that make document outlines client side
>>> * Code written in PHP, Java, Python, Perl, Ruby, etc that make document
>>>   outlines server side
>>> * Browsers that should present document outlines to assistive
>>>   technology, like screen readers
>>
>> These should all implement the outline algorithm if they want to work
>> correctly, even if hgroup is dropped.
>
> Yes, but the complexity of that algorithm would increase dramatically.

Not really (see below).

> My argument was that such an *extra* layer of complexity is bad.
>
>
>>> Currently this simple code is unambiguous:
>>> document.querySelectorAll("h2")
>>
>> Assuming this wants to select all second-level headings, it is wrong
>> even without hgroup (see above).
>
> Assuming one would like to select h2 *headings* it is unambiguous.
> Assuming one would like to select second level *headings* it is not.
>
>>> However, when hgroup has been introduced it will need to be rewritten
>>> into something very complex, like
>>> document.querySelectorAll(":not(hgroup) > h2, hgroup > h2:first-child")
>>
>> This is also wrong (if we ignore the fact that h2 might not be a
>> second-level heading, hgroup's rank is determined by its highest ranked
>> child heading, not its first child).
>>
>
> I could not explain the fact that hgroup is a very complex technique  
> better myself ;-)
>
>>> Complexity always comes with an increased risk for bugs.
>>
>> The outline algorithm is still on the same order of complexity even if
>> hgroup is dropped.
>
> Is it really? I would very much like to see two scripts that produce  
> outlines, one with and one without hgroup, side by side to prove that
> point.

OK.

http://code.google.com/p/h5o/

Index: h5o-js/src/Section.js
===================================================================
--- h5o-js/src/Section.js (revision 71)
+++ h5o-js/src/Section.js (working copy)
@@ -46,9 +46,6 @@
  var _sectionHeadingText = function(sectionHeading)
  {
   if (isHeading(sectionHeading)) {
-  if (_getTagName(sectionHeading)=='HGROUP') {
-   sectionHeading = sectionHeading.getElementsByTagName('h'+(-_getHeadingElementRank(sectionHeading)))[0];
-  }
    // @todo: try to resolve text content from img[alt] or *[title]
    return sectionHeading.textContent || sectionHeading.innerText || "<i>No text content inside "+sectionHeading.nodeName+"</i>";
   }
Index: h5o-js/src/func.js
===================================================================
--- h5o-js/src/func.js (revision 71)
+++ h5o-js/src/func.js (working copy)
@@ -14,7 +14,7 @@
   
   var isSecRoot = _createTagChecker('^BLOCKQUOTE|BODY|DETAILS|FIELDSET|FIGURE|TD$'),
    isSecContent= _createTagChecker('^ARTICLE|ASIDE|NAV|SECTION$'),
-  isHeading = _createTagChecker('^H[1-6]|HGROUP$'),
+  isHeading = _createTagChecker('^H[1-6]$'),
    isElement = function(obj) { return obj && obj.tagName; };
   
   /*
@@ -40,15 +40,7 @@
   var _getHeadingElementRank = function(el)
   {
    var elTagName = _getTagName(el);
-  if (elTagName=='HGROUP') {
-   /* The rank of an hgroup element is the rank of the highest-ranked h1-h6 element descendant of the hgroup element, if there are any such elements, or otherwise the same as for an h1 element (the highest rank). */
-   for (var i=1; i <= 6; i++) {
-    if (el.getElementsByTagName('H'+i).length > 0)
-     return -i;
-   }
-  } else {
-   return -parseInt(elTagName.substr(1));
-  }
+  return -parseInt(elTagName.substr(1));
   };
   
   var _lastSection = function (outlineOrSection)


I'll leave it to someone interested to apply the patch and compare the  
complexity and performance difference.
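
For a quick feel of what the patch removes, the two rank functions can be
sketched side by side on plain objects. This is an illustration, not the
h5o code: the function names and the { tag, children } object shape are
made up, only direct children are inspected (h5o uses
getElementsByTagName, which looks at all descendants), and the fallback
for an hgroup with no h1-h6 inside follows the spec text quoted in the
removed comment.

```javascript
// Hypothetical stand-in for an element: { tag: "H2" } or
// { tag: "HGROUP", children: [{ tag: "H1" }, { tag: "H2" }] }.

// Without hgroup: the rank comes straight from the tag name.
// Ranks are negative so that h1 (-1) compares as the highest.
function rankWithoutHgroup(el) {
  return -parseInt(el.tag.slice(1), 10);
}

// With hgroup: the rank is that of the highest-ranked heading inside.
function rankWithHgroup(el) {
  if (el.tag === "HGROUP") {
    for (var i = 1; i <= 6; i++) {
      var found = (el.children || []).some(function (child) {
        return child.tag === "H" + i;
      });
      if (found) {
        return -i;
      }
    }
    return -1; // no h1-h6 inside: same rank as an h1
  }
  return rankWithoutHgroup(el);
}

// The first child is an h3, but the group ranks as an h2.
var groupRank = rankWithHgroup({
  tag: "HGROUP",
  children: [{ tag: "H3" }, { tag: "H2" }]
});
```

Here groupRank is -2, which is why a selector based on :first-child does
not find the heading that determines the hgroup's rank.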


> Also I would like to see 2 scripts that extract certain headers, one  
> with and one without hgroup, to prove that statement.

I believe the above script can be used to do that.

>>> For long documents this complexity also means that scripts will execute
>>> slower. And the longer the document, the more advantageous it is to
>>> add an outline!
>>
>> Removing hgroup will not change the performance characteristics of the
>> outline algorithm substantially.
>
> At least dropping hgroup will make the performance somewhat better.  
> Until we have tests we can not say for certain.

At least now there's a starting point. :-)

>>> There already exist scripts, both client and server side, that
>>> generate outlines. All of these scripts must be updated for hgroup to
>>> work, as well as browsers.
>>
>> They must be updated to handle sectioning elements anyway.
>
> Yes, but without hgroup the job will be simpler.
>
> -- cut---
>
>>> There are in fact two extremes on a scale when it comes to subtitles:
>>> * True subtitles that are actually part of the title: "Dr Strangelove or
>>>   how I stopped worrying and started to love the bomb"
>>> * Tag lines: "Alien - In space no one can hear you scream"
>>
>> I think using different markup for the two use cases would be highly
>> confusing (not to mention the permathreads about which markup to use for
>> a given case).
>
> And no specific suggestion was made. But it illustrates the fact that  
> subtitles are complex in ordinary language and that some more research  
> would be good before we decide on a solution.
>
>>> Markup generated in WYSIWYG environments will probably be messed up
>>> with hgroup. First of all it affects workflow:
>>
>> I don't see why the workflow would need to be different compared to a
>> different element/markup pattern for subtitles.
>
> My remark was the result of trying to write a UI (on paper) for hgroup.
> No matter how I did it, I could not get away from the fact that the
> intuitive way of working was to select the text for the subtitle and
> choose "set as subtitle", which in turn would trigger backstepping and
> wrapping of the selected text and the heading preceding it in hgroup.
>
> Thus, the UI is totally disconnected from the markup it produces.
>
>>> And what if a user decides that the subtitle should be dropped? It is
>>> not hard to imagine many pages getting the following code snippets all
>>> over them:
>>
>> It's also not hard to imagine that the WYSIWYG editor detects that
>> there's only one heading in hgroup and cleans up the markup.
>
> From my experience with WYSIWYG editors I'd say "don't hold your
> breath..."
>
> We already see hgroup being abused by authors who code by hand, e.g:
> http://wiki.whatwg.org/wiki/Hgroup_element#Apple
>
> Also, this perfectly illustrates the principle of complexity. Why should  
> such detecting be necessary and when should it be triggered?
>
> And what if a user then changes his or her mind again and CTRL+Z is  
> invoked? The undo-history will also be more complex.

I don't have experience with writing WYSIWYG editors, but it seems to me  
it's on the same order of complexity as when the user removes the last  
list item in a list.

>>> Imagine somebody writing a script that assumes that last-child heading
>>> elements inside hgroup are always subtitles?
>>
>> Such a script would already be broken since the last child is not
>> guaranteed to be the not-highest-ranked heading in hgroup even if there
>> are multiple headings.
>
> Once again, could the complexity of hgroup be better illustrated? This  
> is not an argument in favor of hgroup!
>
>>> Teachability concerns
>>
>> These concerns may very well be valid. It may also apply to headings and
>> sections in general.
>
> Yes, sections are hard to teach. (I have not found a way to do it that I  
> am satisfied with in a book I am writing.)
>
> So why make it even more complicated?

To support the use case of subtitles.

> The rebuttal also fails to address why hgroup is superior to the other  
> patterns, e.g. this one:
>
> <header>
>    <h1 />
>    <p />
> </header>

How do you differentiate that from the case where you simply want a
paragraph after a heading in <header>? Would it be possible to include
the subheading (as part of its section's heading) in the ToC, as
suggested in http://www.w3.org/Bugs/Public/show_bug.cgi?id=14707,
without getting full paragraphs in the ToC?

> That solution *is* backwards compatible - hgroup is not. Hint: HTML5  
> design principles...
>
> Since the p-element is semantically made a part of the header, I'd say
> it perfectly fits as a solution for subtitles.

You said it was bad that <h2> changed meaning based on its placement. Why  
is it OK for <p> to change meaning based on its placement?

Is it still a subtitle in the following case?:

   <header>
    <h1>...</h1>
    foo
    <p>...</p>
   </header>

-- 
Simon Pieters
Opera Software

Received on Monday, 7 November 2011 06:20:17 UTC