24369 – Reason for ‘ruby base span’ attribute to come back

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	HTML5 spec
Version:	unspecified
Hardware:	All All

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Robin Berjon
QA Contact:	HTML WG Bugzilla archive list

Reported:	2014-01-22 23:37 UTC by Chen Yijun
Modified:	2016-04-20 20:16 UTC
CC List:	8 users

Description Chen Yijun 2014-01-22 23:37:54 UTC

The `rbspan` attribute for ruby text is important for some use cases, e.g., writing ruby in a reference-like syntax, which keeps the markup readable and degrades more gracefully in web browsers that lack ruby support or have it disabled. (I've discussed this issue on the public-i18n-cjk mailing list[1].)

[1]: http://lists.w3.org/Archives/Public/public-i18n-cjk/2013OctDec/0052.html

<p><ruby>
    <rb>明朝<rb>是<rb>中國<rb>歷史<rb>上<rb>最後<rb>一個<rb>由<rb>漢族<rb>建立<rb>的<rb>中原王朝</rb>，<rb>歷經</rb>12<rb>世</rb>、16<rb>位<rb>皇帝。明朝初期定都於應天府，1421年明成祖遷都至順天府。1368年，朱元璋在統一農民起義軍後，在應天府登基，國號大明。明朝初年，國力迅速恢復，經過明太祖朱元璋的洪武之治，勵精圖治並逐步恢復國力。

    <rtc lang="zh-cmn-Latn">
        <rp lang="zh-cmn">（<strong>上方段落的普通話漢語拼音：</strong></rp>
        <rt>mingchao
        <rt>shi
        <rt>zhongguo
        <rt>lishi
        <rt>shang
        <rt>zuihou
        <rt>yige
        <rt>you
        <rt>hanzu
        <rt>jianli
        <rt>de
        <rt>zhongyuanwangchao<rp>, </rp>
        <rt>lijing
        <rp>12</rp>
        <rt>shi
        ……
        <rp>）</rp>
    </rtc>

    <rtc lang="nan-Latn">
        <rp lang="zh-cmn">（<strong>上方段落的閩南語羅馬拼音：</strong></rp>
        <rt>bin-tiau
        <rt>si
        ……
        <rp>）</rp>
    </rtc>
</ruby></p>

As the code example above shows, reference-like ruby is much easier to read than `<rb><rt>`-style syntax in a fairly long paragraph. The `rbspan` attribute lets us make the best use of it, especially for complex ruby.

Also, for browsers that do not support ruby, or for users who have disabled it, the annotations can simply be displayed as a paragraph of their own (with proper punctuation and explanation in `<rp>`s) instead of following each character or phrase and distracting the reader.

Furthermore, if each word or phrase is wrapped in its own ruby element, it becomes impossible to wrap an element around text that crosses phrase boundaries.

In the code block below, suppose the author wants to add a hyperlink to the text ‘有聽著’ in the sentence ‘你敢有聽著咱的歌’ (segmented as 你 / 敢有 / 聽著 / 咱 / 的 / 歌). Because 有 and 聽著 sit in different ruby elements, the single hyperlink has to be split into two.

<ruby>你</ruby>
<ruby>敢<a href="#yes-i-do">有</a></ruby>
<a href="#yes-i-do"><ruby>聽著</ruby></a>
<ruby>咱</ruby>
<ruby>的</ruby>
<ruby>歌</ruby>？

An improved version:

<ruby>
    <rb>你
    <rb>敢
    <a href="#yes-i-do">
        <rb>有
        <rb>聽<rb>著
    </a>
    <rb>咱
    <rb>的
    <rb>歌</rb>？

    <rtc>……</rtc>
    <rtc>……</rtc>
</ruby>

A single hyperlink gives better semantic structure, while two break the simplicity of the syntax. The two links would also behave oddly (for example on hover, active and focus).

This is only practical if `rbspan` exists.
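
For illustration, here is a minimal sketch of what `rbspan` does. It assumes the semantics of the W3C Ruby Annotation (XHTML) specification, where `rbspan` on an `rt` gives the number of bases that annotation spans. Splitting 明朝 into two `rb` elements is an assumption made only for this sketch; the readings are the ones used in the first example above. Whether HTML5 user agents would honour the attribute is exactly what this bug is asking about.

```html
<!-- Sketch only: assumes rbspan behaves as in the W3C Ruby Annotation spec. -->
<ruby>
    <rb>明</rb><rb>朝</rb><rb>是</rb>

    <!-- First annotation level: one rt covers the two bases 明 and 朝. -->
    <rtc lang="zh-cmn-Latn">
        <rt rbspan="2">mingchao</rt>
        <rt>shi</rt>
    </rtc>

    <!-- Second annotation level, spanning the same bases. -->
    <rtc lang="nan-Latn">
        <rt rbspan="2">bin-tiau</rt>
        <rt>si</rt>
    </rtc>
</ruby>
```

Without `rbspan`, each `rt` would pair with exactly one `rb`, so bases would have to be grouped the same way in every annotation level.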

Comment 1 Richard Ishida 2014-03-25 18:01:00 UTC

There may be more than one bug rolled together here. 

I guess my first question is how widespread is the use case for putting all the annotations at the end of the paragraph? I can see how there is more of an opportunity for this in Chinese than in Japanese (which has lots of kana gaps), but is it really a common approach, or just something that 'you could do'?

Over the years I have come across people suggesting all sorts of possible ways to stretch ruby, with a little shoe-horning, to do things that were outside its main use cases (e.g. phonetic descriptions, linguistic glosses, ...). Adding functionality to cover all those requirements is likely to add several more years before ruby is implemented, given the way things have been progressing, so recently I've been concentrating on pushing for support for the core use cases, so that at least we can cover the majority of needs in a timely way.

Note, also, that you can handle some of the rbspan issues by using empty rt elements, rather than rp. I think this avoids twisting the semantics. For example, you can associate the 

<ruby><rb>明朝<rb>是<rb>中國<rb>歷史<rb>上<rb>最後<rb>一個<rb>由<rb>漢族<rb>建立<rb>的<rb>中原王朝<rb>，<rb>歷經<rb>12<rb>世<rb>、16<rb>位<rb>皇帝。
<rt></ruby>

    <rtc lang="zh-cmn-Latn">
        <rp lang="zh-cmn">（<strong>上方段落的普通話漢語拼音：</strong></rp>
        ...
        <rt>you
        <rt>hanzu
        <rt>jianli
        <rt>de
        <rt>zhongyuanwangchao
        <rt>
        <rt>lijing
        <rt>
        <rt>shi
        ...
        <rp>）</rp>
    </rtc>


Where you run into problems is when you have double-sided ruby that doesn't match and you want to use these really long runs of rb followed by rt.

Comment 2 Arron Eicholz 2016-04-20 20:16:31 UTC

HTML5.1 Bugzilla Bug Triage: 

This bug constitutes a request for a new feature of HTML. Our current guideline, rather than tracking such requests as bugs or issues, is to create a proposal for the desired behavior, or at least a sketch of what is wanted (much of which is probably contained in this bug), and start the discussion/proposal in the WICG (https://www.w3.org/community/wicg/). As your idea gains interest and momentum, it may be brought back into HTML through the Intent to Migrate process (https://wicg.github.io/admin/intent-to-migrate.html).