This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 12935 - <rt> should not auto-close ancestors
Summary: <rt> should not auto-close ancestors
Status: RESOLVED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: LC1 HTML5 spec
Version: unspecified
Hardware: PC All
Importance: P2 blocker
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2011-06-10 17:23 UTC by Boris Zbarsky
Modified: 2011-08-04 05:13 UTC
CC List: 13 users

Description Boris Zbarsky 2011-06-10 17:23:50 UTC
We're implementing ruby support in Gecko, and discovered that the HTML5 parsing algorithm requires that <rt> close <rtc>.

This behavior does not match the pre-HTML5-parser Gecko behavior, the pre-HTML5-parser WebKit behavior, Presto, or IE9+.

It does match IE8-.

Given that IE has already changed its behavior here once, I see no reason to require them to change it again to what is arguably a less-useful behavior.  So I think what we should do is to remove the <rt> magic from the spec entirely, simplifying the spec and making it compatible with what IE9, Gecko 1.9.2, Presto, and pre-HTML5-parser WebKit do.

There will still be the issue that it's not safe for authors to actually use <rtc> until IE8- marketshare becomes negligible, but I would expect the timeframe for this to be on par with the timeframe for HTML5 getting anywhere close to REC if not shorter.  In the meantime, there is clearly not a serious web compat issue with the proposed change, since IE9, Gecko 1.9.2, Presto, and pretty recent WebKit all have/had that behavior.  So there should be nothing stopping Gecko and WebKit from changing away from their current behavior.
Comment 1 Simon Pieters 2011-06-10 17:57:22 UTC
I can't find "rtc" in the spec. Did you mean "rp"?
Comment 2 Boris Zbarsky 2011-06-10 18:02:02 UTC
No, I mean <rtc>.  The spec currently calls for <rt> to close all ancestors to the nearest <ruby> if it's inside a <ruby>.  In practice, the only such ancestor people would want to use is <rtc>, when trying to use markup compatible with http://www.w3.org/TR/ruby/

In any case there's nothing <rtc>-specific about this bug.  We should just remove the "<rt> auto-closes things" behavior.
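
[Illustration] The auto-closing behavior described in this comment can be sketched with a toy stack of open elements. This is a simplified model, not the spec text: the names IMPLIED_END and start_rt_old are hypothetical, and "ruby in scope" is approximated by a plain membership test on the stack.

```python
# Toy model of the <rt> start tag handling that this bug is about.
# Assumption: element names stand in for real DOM nodes.
IMPLIED_END = {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}


def start_rt_old(stack):
    """Handle an <rt> start tag the way the thread describes the old spec."""
    if "ruby" in stack:  # simplification of "has a ruby element in scope"
        # Generate implied end tags.
        while stack[-1] in IMPLIED_END:
            stack.pop()
        # The contested step: close every ancestor up to the nearest <ruby>.
        while stack[-1] != "ruby":
            stack.pop()
    parent = stack[-1]
    stack.append("rt")
    return parent


# <ruby><rtc><rt>: the <rtc> is closed before <rt> is inserted.
stack = ["html", "body", "ruby", "rtc"]
parent = start_rt_old(stack)
print(parent)  # ruby
print(stack)   # ['html', 'body', 'ruby', 'rt']
```

In this model <rt> can never become a child of <rtc>, which is the incompatibility with http://www.w3.org/TR/ruby/ markup that the comment points out.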
Comment 3 Simon Pieters 2011-06-10 18:15:15 UTC
I see. But you still want </rp> and </rt> to be implied, right?
Comment 4 Boris Zbarsky 2011-06-11 00:15:07 UTC
Simon, in terms of UA behavior, looks like pre-HTML5-parser Gecko does not imply those, nor does Presto.  pre-HTML5-parser WebKit does imply them, as does IE9.  

I don't have a problem with implying </rp> and </rt>.  I also don't have a problem with not implying them; it looks like either behavior is web-compatible and neither one forecloses anything people might want to do with ruby in the future as far as I can tell.
Comment 5 Ian 'Hixie' Hickson 2011-06-11 01:01:47 UTC
So the current design is intended to make this kind of markup work ok:

   <ruby> base <rt> annotation </ruby>

...as well as make this kind of markup error-correct quickly:

   <ruby> <b>base <rt> annotation </ruby>

I don't feel particularly strongly about this second issue, but it is one that would go away if we change this.

Would simply removing ";pop all the nodes from the current node up to the node immediately before the bottommost ruby element on the stack of open elements" from the "rt" start tag handling be sufficient for your needs?

If so, please assume the spec is so changed. I'll actually make the change once I see your confirmation and have the editor open.
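
[Illustration] A rough sketch of what dropping the "pop all the nodes ..." step changes for the two markup examples in this comment. It is a toy model under stated assumptions: start_rt and the pop_to_ruby flag are hypothetical names, and scope checking is reduced to a membership test.

```python
# Toy comparison of <rt> handling with and without the "pop to ruby" step.
IMPLIED_END = {"dd", "dt", "li", "option", "optgroup", "p", "rp", "rt"}


def start_rt(stack, pop_to_ruby):
    """Insert <rt>; pop_to_ruby=True models the behavior before the change."""
    if "ruby" in stack:  # simplification of "has a ruby element in scope"
        while stack[-1] in IMPLIED_END:  # generate implied end tags
            stack.pop()
        if pop_to_ruby:
            while stack[-1] != "ruby":
                stack.pop()
    parent = stack[-1]
    stack.append("rt")
    return parent


# <ruby> <b>base <rt> annotation </ruby>
before = start_rt(["ruby", "b"], pop_to_ruby=True)    # <b> is closed early
after = start_rt(["ruby", "b"], pop_to_ruby=False)    # <rt> nests inside <b>

# <ruby> base <rt> one <rt> two </ruby>: the simple case is unaffected,
# because the open <rt> is closed by the implied end tags alone.
simple = start_rt(["ruby", "rt"], pop_to_ruby=False)

print(before, after, simple)  # ruby b ruby
```

The second example therefore loses its quick error correction, as the comment notes, while omitted </rt> tags keep working.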
Comment 6 Boris Zbarsky 2011-06-11 03:29:20 UTC
Hmm.  If we do want to handle the second issue, we could just pop to ruby or rtc...

If we don't want to handle it, then I _think_ your proposed change does what I want, but I'd like Henri to confirm: he's a lot more familiar with this neck of the woods than I am.
Comment 7 Henri Sivonen 2011-06-13 11:48:57 UTC
(In reply to comment #6)
> Hmm.  If we do want to handle the second issue, we could just pop to ruby or
> rtc...

If we are confident that we don't want to introduce more elements than what the old Ruby spec had and if we like implied tags, then it would make sense to:
 1) Make rt pop until ruby *or* rtc whichever is seen first (still only if there's ruby in scope).
 2) Make rb pop until ruby *or* rbc whichever is seen first (still only if there's ruby in scope).
 3) Make rbc and rtc behave like rp behaves now.
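
[Illustration] The three rules above could be modeled roughly as follows. This is a sketch of one reading of the proposal, not tested parser code: STOP_AT and start_tag are hypothetical names, only start tags are modeled, and popping is assumed to subsume "generate implied end tags".

```python
# Each ruby-related start tag pops open elements until it reaches one of
# its stop elements (only when a <ruby> is open at all).
STOP_AT = {
    "rt": {"ruby", "rtc"},   # rule 1
    "rb": {"ruby", "rbc"},   # rule 2
    "rp": {"ruby"},          # rp as it behaved at the time
    "rbc": {"ruby"},         # rule 3: like rp
    "rtc": {"ruby"},         # rule 3: like rp
}


def start_tag(stack, name):
    """Insert element `name` and return the name of its parent."""
    if name in STOP_AT and "ruby" in stack:
        while stack[-1] not in STOP_AT[name]:
            stack.pop()
    parent = stack[-1]
    stack.append(name)
    return parent


# <ruby><rbc><rb>A<rb>B<rtc><rt>a<rt>b with all end tags omitted
stack = ["body"]
tags = ["ruby", "rbc", "rb", "rb", "rtc", "rt", "rt"]
parents = [start_tag(stack, tag) for tag in tags]
print(parents)
# ['body', 'ruby', 'rbc', 'rbc', 'ruby', 'rtc', 'rtc']
```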

> If we don't want to handle it, then I _think_ your proposed change does what I
> want, but I'd like Henri to confirm: he's a lot more familiar with this neck of
> the woods than I am.

If we only did what Hixie said, we'd lose implicit closing of rb and rp in the simple Ruby case.

How much do we want to keep the simple Ruby case old IE-like and how much do we want to support tag omission?
Comment 8 Simon Pieters 2011-06-13 13:37:07 UTC
(In reply to comment #7)

> If we only did what Hixie said, we'd lose implicit closing of rb and rp in the
> simple Ruby case.

Nope, "generate implied end tags" still closes rb and rp.

I'm not convinced there's a need to aggressively pop stuff inside ruby, so I think Hixie's suggestion is a good one.
Comment 9 fantasai 2011-06-13 16:29:52 UTC
I can't say I'm too familiar with the way HTML parsing works, but I would assume:
  - We won't need any markup beyond what's in XHTML Ruby Annotation
  - We're likely to need <rb> and <rtc>, if not now, then at some point in the
    future.

<rbc> seems mostly useless to me, but could be kept around for compat with XHTML Ruby Annotation markup.

Wrt mixing other elements and ruby, I've heard it suggested for e.g. Chinese to mark up an entire paragraph with phonetics by placing the entire paragraph inside <ruby>. If you want to support that, then you don't actually want to have <rt> auto-close a <b>.

My uninformed opinion is that you should have any ruby markup's closing tag auto-close anything that was opened inside it. I don't think we need to support mis-nested tags here.

In an ideal world I would have:
  * <rt> autoclose <rt>, <rb>, <rp>, <rbc>
  * <rb> autoclose <rt>, <rb>, <rp>, <rtc>
  * <rp> autoclose <rt>, <rb>, <rp>, <rbc>
  * <rtc> autoclose <rt>, <rb>, <rp>, <rtc>, <rbc>
  * <rbc> autoclose <rt>, <rp>, <rtc>, <rbc>
  * Any ruby closing tag (including an autoclose) close anything opened
    inside that element
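
[Illustration] The list above reads naturally as a lookup table. The sketch below is one possible interpretation, not an agreed algorithm: AUTOCLOSE mirrors the bullets, while start_tag and the rule "close the outermost matching element above the nearest <ruby>, together with everything opened inside it" are assumptions made for illustration.

```python
# Which open elements each ruby start tag auto-closes (from the list above).
AUTOCLOSE = {
    "rt": {"rt", "rb", "rp", "rbc"},
    "rb": {"rt", "rb", "rp", "rtc"},
    "rp": {"rt", "rb", "rp", "rbc"},
    "rtc": {"rt", "rb", "rp", "rtc", "rbc"},
    "rbc": {"rt", "rp", "rtc", "rbc"},
}


def start_tag(stack, name):
    """Insert element `name` and return the name of its parent."""
    if name in AUTOCLOSE and "ruby" in stack:
        ruby_at = len(stack) - 1 - stack[::-1].index("ruby")
        for i in range(ruby_at + 1, len(stack)):
            if stack[i] in AUTOCLOSE[name]:
                # Closing an element also closes anything opened inside it.
                del stack[i:]
                break
    parent = stack[-1]
    stack.append(name)
    return parent


# <ruby><rbc><rb>a<rb>b<rp>(<rtc><rt>e<rt>f with all end tags omitted
stack = ["body"]
tags = ["ruby", "rbc", "rb", "rb", "rp", "rtc", "rt", "rt"]
parents = [start_tag(stack, tag) for tag in tags]
print(parents)
# ['body', 'ruby', 'rbc', 'rbc', 'ruby', 'ruby', 'rtc', 'rtc']

# Paragraph-level ruby: <b> is not in the table, so <rt> does not close it.
print(start_tag(["body", "ruby", "b"], "rt"))  # b
```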
Comment 10 contributor 2011-06-13 19:32:42 UTC
Checked in as WHATWG revision r6215.
Check-in comment: Remove some error-handling parsing behaviour near <ruby> elements, for forwards-compatibility.
http://html5.org/tools/web-apps-tracker?from=6214&to=6215
Comment 11 Ian 'Hixie' Hickson 2011-06-13 19:37:03 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Accepted
Change Description: see diffs above and below
Rationale: see comment 8
Comment 12 contributor 2011-06-13 19:42:01 UTC
Checked in as WHATWG revision r6216.
Check-in comment: Remove some error-handling parsing behaviour near <ruby> elements, for forwards-compatibility. (See also previous checkin; sorry about the erroneous annotations therein)
http://html5.org/tools/web-apps-tracker?from=6215&to=6216
Comment 13 Henri Sivonen 2011-06-14 08:52:14 UTC
(In reply to comment #8)
> (In reply to comment #7)
> 
> > If we only did what Hixie said, we'd lose implicit closing of rb and rp in the
> > simple Ruby case.
> 
> Nope, "generate implied end tags" still closes rb and rp.

Actually, "generate implied end tags" closes rp and rt but not rb. Hixie, I think rb should be added to "generate implied end tags".
Comment 14 Henri Sivonen 2011-06-14 09:35:03 UTC
(In reply to comment #13)
> Actually, "generate implied end tags" closes rp and rt but not rb. Hixie, I
> think rb should be added to "generate implied end tags".

Moreover, after experimenting with an implementation, I think rb should behave exactly like rt and rp everywhere in the tree builder.

First, it would be bizarre if <rt> closed <rp> but didn't close <rb>. Second, it would be bizarre if <rt> closed <rb> but <rb> didn't close <rb> when there are multiple <rb>s inside <rbc>.
Comment 15 Henri Sivonen 2011-06-14 11:22:07 UTC
(In reply to comment #14)
> (In reply to comment #13)
> > Actually, "generate implied end tags" closes rp and rt but not rb. Hixie, I
> > think rb should be added to "generate implied end tags".
> 
> Moreover, after experimenting with an implementation, I think rb should behave
> exactly like rt and rp everywhere in the tree builder.
> 
> First, it would be bizarre if <rt> closed <rp> but didn't close <rb>. Second,
> it would be bizarre if <rt> closed <rb> but <rb> didn't close <rb> when there
> are multiple <rb>s inside <rbc>.

And even with that change, this still produces a surprising result:
<ruby><rbc><rb>a<rb>b<rb>c</rbc><rp>(<rtc><rt>e<rt>f<rt>g</rtc><rp>)</ruby>
(rtc goes inside an rp.)
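
[Illustration] The surprising result can be reproduced with a toy tree builder. This is a simplified sketch, not the real algorithm: parse and IMPLIED_END are hypothetical names, text tokens are dropped, "in scope" is a membership test, and <rbc>/<rtc> are treated as unknown elements with no special handling.

```python
# rp, rt and (per comment 13) rb are closed by "generate implied end tags".
IMPLIED_END = {"rp", "rt", "rb"}


def parse(tokens):
    """Return (element, parent) pairs for a list of tag tokens."""
    stack = ["body"]
    parents = []
    for tok in tokens:
        if tok.startswith("/"):
            name = tok[1:]
            if name in stack:
                # End tag: pop up to and including the matching element.
                while stack.pop() != name:
                    pass
        else:
            if tok in IMPLIED_END and "ruby" in stack:
                while stack[-1] in IMPLIED_END:
                    stack.pop()
            parents.append((tok, stack[-1]))
            stack.append(tok)
    return parents


# <ruby><rbc><rb>a<rb>b<rb>c</rbc><rp>(<rtc><rt>e<rt>f<rt>g</rtc><rp>)</ruby>
tokens = ["ruby", "rbc", "rb", "rb", "rb", "/rbc", "rp",
          "rtc", "rt", "rt", "rt", "/rtc", "rp", "/ruby"]
parents = parse(tokens)
print(parents[6])  # ('rtc', 'rp')
```

Nothing closes the open <rp> when <rtc> arrives, because in this model <rtc> does not generate implied end tags.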

Reopening for more careful consideration. It seems like rtc (maybe rbc) should generate implied end tags.

bz, fantasai: Does it still make sense to even try to implement Complex Ruby? Isn't it always possible to decompose Complex Ruby into several Simple Rubies that would have the benefit of working in legacy IE?

That is, why would you write 
<ruby><rbc><rb>A<rb>B</rbc><rtc><rt>a<rt>b</rtc></ruby> instead of 
<ruby>A<rt>a</ruby><ruby>B<rt>b</ruby>? Are there line breaking or styling benefits from using a single Complex Ruby instead of multiple Simple Rubies?
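
[Illustration] The mechanical one-to-one decomposition being asked about can be sketched as below. The function names are hypothetical, no HTML escaping is done, and the sketch only covers the case where each base has exactly one annotation; it says nothing about whether the two forms render or line-break the same way.

```python
def complex_ruby(pairs):
    """Build a single Complex Ruby element from (base, annotation) pairs."""
    bases = "".join(f"<rb>{base}" for base, _ in pairs)
    texts = "".join(f"<rt>{text}" for _, text in pairs)
    return f"<ruby><rbc>{bases}</rbc><rtc>{texts}</rtc></ruby>"


def simple_rubies(pairs):
    """Build one Simple Ruby element per (base, annotation) pair."""
    return "".join(f"<ruby>{base}<rt>{text}</ruby>" for base, text in pairs)


pairs = [("A", "a"), ("B", "b")]
print(complex_ruby(pairs))
# <ruby><rbc><rb>A<rb>B</rbc><rtc><rt>a<rt>b</rtc></ruby>
print(simple_rubies(pairs))
# <ruby>A<rt>a</ruby><ruby>B<rt>b</ruby>
```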

I realize that the design of <rp> makes Complex Ruby degrade more gracefully when the degradation target is a browser with no Ruby support whatsoever, but it's more realistic to expect that at this point, the main degradation target will be legacy IE (unless old versions of the Android default browser manage to become a worse problem than legacy IE...).
Comment 16 fantasai 2011-06-14 19:31:24 UTC
> this still produces a surprising result:

See comment 9. You're missing other auto-closes. :)

> why would you write ...

Brain dump on ruby markup: http://fantasai.inkedblade.net/weblog/2011/ruby/

Sorry it took me so long. Marking up all the examples and drawing pictures took a while...
Comment 17 Boris Zbarsky 2011-06-14 19:41:53 UTC
> Isn't it always possible to decompose Complex Ruby into several Simple Rubies

As I understand, no (e.g. doing that for "Tokyo" doesn't work, I'm told).  But this is all second-hand in my case.
Comment 18 Ian 'Hixie' Hickson 2011-06-14 19:48:26 UTC
EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Rejected
Change Description: no spec change
Rationale: I've no intention of ever adding <rb> or <rtc>. This is a minor feature and should not be overengineered. Complex ruby is a classic case of the danger of overspecialisation (where a group of people are experts in one area and come up with a design to solve 100% of use cases, without seeing that in the big picture the feature as a whole is rather minor and doesn't need to cover more than 80% of use cases). HTML should remain simple, if you want something that covers more semantics, use DocBook.
Comment 19 Boris Zbarsky 2011-06-14 19:57:00 UTC
Going to escalate this to the tracker, I'm afraid.  As I said, I have no opinions on whether complex ruby is something we want to add, but I'm opposed to us foreclosing the option now due to an auto-closing behavior whose benefits are somewhat dubious to start with.

Tracker issue title: "HTML parsing spec should not prevent future implementation of complex ruby or other development of ruby markup".

Tracker issue text: "The current behavior of <rt> closing all ancestor tags up to the <ruby> is not compatible with the existing complex ruby proposals, nor with the paragraph-level markup suggestions from comment 9".

On a personal note, I think that suggesting people use DocBook for cases like "write a web page for kids that includes the word 'Tokyo'" is ridiculous.  It's somewhat unfortunate that some of the writing systems involved are so complicated and the complications seep into HTML, but rejecting the authoring of reasonable HTML documents in these writing systems is on its face a bad idea in my view.
Comment 20 Simon Pieters 2011-06-15 11:57:02 UTC
(In reply to comment #19)
> Going to escalate this to the tracker, I'm afraid.  As I said, I have no
> opinions on whether complex ruby is something we want to add, but I'm opposed
> to us foreclosing the option now due to an auto-closing behavior whose benefits
> are somewhat dubious to start with.
> 
> Tracker issue title: "HTML parsing spec should not prevent future
> implementation of complex ruby or other development of ruby markup".
> 
> Tracker issue text: "The current behavior of <rt> closing all ancestor tags up
> to the <ruby> is not compatible with the existing complex ruby proposals, nor
> with the paragraph-level markup suggestions from comment 9".

The spec was changed already. <rt> doesn't close all ancestors up to <ruby> anymore.

http://html5.org/tools/web-apps-tracker?from=6215&to=6216
Comment 21 Boris Zbarsky 2011-06-15 16:20:01 UTC
Then what exactly does comment 18 mean?
Comment 22 Yuhong Bao 2011-06-15 23:48:27 UTC
(In reply to comment #20)
> The spec was changed already. <rt> doesn't close all ancestors up to <ruby>
> anymore.
> 
> http://html5.org/tools/web-apps-tracker?from=6215&to=6216

Changing this to RESOLVED FIXED.
Comment 23 Yuhong Bao 2011-06-15 23:53:00 UTC
(In reply to comment #20)
> The spec was changed already. <rt> doesn't close all ancestors up to <ruby>
> anymore.
> 
> http://html5.org/tools/web-apps-tracker?from=6215&to=6216

That was a comment made by the editor, I think. Probably the editor did not realize it was already fixed.
Comment 24 Boris Zbarsky 2011-06-16 01:12:30 UTC
> Probably the editor did not realize it was already fixed.

The editor made the change you're quoting.  The chance that he did not realize he had made the change is about 0.  So I'd still like to understand what comment 18 is about.  I'd also really appreciate it if people didn't mess with bug metadata based on assumptions about what "probably" happened.
Comment 25 Simon Pieters 2011-06-16 11:27:13 UTC
Changing back to WONTFIX.

If you read all the comments you will see what happened. First, Hixie made the change to the spec (comment 10). Then Henri reopened the bug and asked for consideration about further changes (comment 13), which the editor rejected.
Comment 26 Henri Sivonen 2011-06-16 14:11:12 UTC
(In reply to comment #16)
> > this still produces a surprising result:
> 
> See comment 9. You're missing other auto-closes. :)

I have an experimental build of Firefox that does the following:
 * rbc, rtc, rp, rb and rt all go to the list of elements that are closed by "generate implied end tags".
 * If there is a ruby element in scope, rbc, rtc and rp start tags generate implied end tags.
 * If there is a ruby element in scope, rb generates implied end tags except for rbc.
 * If there is a ruby element in scope, rt generates implied end tags except for rtc.
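
[Illustration] The four rules above can be approximated with a toy tree builder. This is a sketch, not the code of the experimental build: parse, IMPLIED_END and EXCEPT are hypothetical names, only start tags are modeled, and "in scope" is reduced to a membership test on the stack.

```python
# All five ruby child elements are closed by "generate implied end tags".
IMPLIED_END = {"rbc", "rtc", "rp", "rb", "rt"}
# rb and rt generate implied end tags *except for* their container.
EXCEPT = {"rb": "rbc", "rt": "rtc"}


def parse(tokens):
    """Return (element, parent) pairs for a list of start tag names."""
    stack = ["body"]
    parents = []
    for tok in tokens:
        if tok in IMPLIED_END and "ruby" in stack:
            keep = EXCEPT.get(tok)  # None for rbc, rtc and rp
            while stack[-1] in IMPLIED_END and stack[-1] != keep:
                stack.pop()
        parents.append((tok, stack[-1]))
        stack.append(tok)
    return parents


# Complex Ruby with every end tag omitted:
# <ruby><rbc><rb>a<rb>b<rp>(<rtc><rt>e<rt>f<rp>)
complex_parents = parse(["ruby", "rbc", "rb", "rb", "rp",
                         "rtc", "rt", "rt", "rp"])
print(complex_parents[5])  # ('rtc', 'ruby')

# Simple Ruby with tag omission: <ruby>base<rt>one<rt>two
simple_parents = parse(["ruby", "rt", "rt"])
print(simple_parents)
# [('ruby', 'body'), ('rt', 'ruby'), ('rt', 'ruby')]
```

Under these rules <rtc> becomes a child of <ruby> rather than of the preceding <rp>, and consecutive <rt> elements stay siblings in both the simple and the complex case.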

Parse errors not figured out yet.

The above seems to lead to sensible behavior. That is, Simple Ruby with tag omission targeted for IE8 and earlier would still work. Yet, all of Complex Ruby would be possible, too, all with optional end tags. Some Simple Ruby with omitted end tags would fail to degrade gracefully in IE9, though, but I don't have much sympathy for IE9 going and cloning old WebKit instead of implementing the spec and giving spec feedback.

Opinions? Note that "generate implied end tags except for foo" where foo is a single element name is a pre-existing tree builder pattern.

Technically, it would be possible to imply <ruby> when there isn't already a ruby element in scope, but I'm hesitant to do that, because authors seem to find it surprising that you can omit <html>, <head>, <body> and <tbody>.

> > why would you write ...
> 
> Brain dump on ruby markup: http://fantasai.inkedblade.net/weblog/2011/ruby/

Thank you.
Comment 27 Julian Reschke 2011-06-16 16:42:44 UTC
I believe the TrackerRequest keyword was removed without agreement from Boris Z.
Comment 28 Boris Zbarsky 2011-06-16 16:48:04 UTC
Actually, the current state of the spec reflects what I filed the bug about sufficiently that I feel no need to escalate this.  Had I wanted to restore the keyword, I would have. ;)
Comment 29 Julian Reschke 2011-06-16 16:50:49 UTC
(In reply to comment #27)
> I believe the TrackerRequest keyword was removed without agreement from Boris
> Z.

Furthermore, setting this to "WONTFIX" when in fact changes were made is highly confusing.
Comment 30 Ian 'Hixie' Hickson 2011-06-16 17:14:57 UTC
That's why I prefer that people file just one issue per bug. :-)
Comment 31 Boris Zbarsky 2011-06-16 17:17:32 UTC
OK, I talked this over with Ian.  The wontfix was for comment 15.

Henri, if you think there should be more changes here, could you please file new bugs instead of reopening this one?
Comment 32 Sam Ruby 2011-06-17 12:09:48 UTC
Removed TrackerRequest (again) as Henri indicated to me that he intends to pursue a separate bug
Comment 33 Henri Sivonen 2011-07-01 13:34:13 UTC
Bug 13113 filed as the follow-up.
Comment 34 Michael[tm] Smith 2011-08-04 05:13:35 UTC
mass-move component to LC1