Rules for Simple Placement of Japanese Ruby

W3C First Public Working Draft

This version:
https://www.w3.org/TR/2020/WD-simple-ruby-20200609/
Latest published version:
https://www.w3.org/TR/simple-ruby/
Latest editor's draft:
https://w3c.github.io/simple-ruby/
Editors:
Florian Rivoal (Invited Expert)
Atsushi Shimono (W3C)
Richard Ishida (W3C)
Author:
Toshi Kobayashi
Participate:
GitHub w3c/simple-ruby
File a bug
Commit history
Pull requests

This document is also available in this non-normative format: Original Japanese (PDF)


Abstract

A simple set of rules for placement of Ruby text in Japanese typography.

Status of This Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

This document was initially written in Japanese and translated to English by the Japanese Writing Technology Working Group of the Advanced Publishing Laboratory of Keio University.

It represents the subjective view of its authors and contributors as to one possible approach to address the problem, and does not claim to be the only possible solution. It is submitted to present a non-Japanese speaking audience with this particular approach, and to encourage discussion of this topic.

The original Japanese version is available in PDF format.

This document was published by the Internationalization Working Group as a First Public Working Draft.

GitHub Issues are preferred for discussion of this specification.

Publication as a First Public Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

This document was produced by a group operating under the W3C Patent Policy. The group does not expect this document to become a W3C Recommendation. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 March 2019 W3C Process Document.

1. Purpose of this document

Ruby is the name given to the small annotations in Japanese content that are rendered alongside base text, usually to provide a pronunciation guide, but sometimes to provide other information. (See the article “What is ruby” by the Internationalization Working Group for more information.)

As a guide for implementers of rendering engines, this document describes a simple method of ruby composition that can be used for Japanese layout realized with technologies like CSS, SVG, and XSL-FO. Unlike JLReq [JLREQ], in this document only one layout method is presented for each alternative, taking into consideration best practices and important points related to Japanese layout. The points taken into consideration are described in § 2. Considerations for the placement rules. In addition, this document provides recommendations for double-sided ruby, where two distinct runs of ruby text are attached to the same ruby base (this is not described in [JLREQ]).

[JLREQ] is largely a record of Japanese layout as established for the printing industry. It explains multiple ways to achieve one thing, and sometimes these approaches can be very complex. Ruby is one such case. There are so many factors to consider, and often contradictory requirements (see some examples), leading to a complexity that makes it challenging to automate ruby.

It would therefore be beneficial to have a method that is simple, robust, and suitable for automatic processing. In such cases, rather than ideal positioning, we must at least make sure that the positioning causes no misunderstanding.

The following is a proposal for just such a simple processing system. The target audience is implementers and specification writers. It is expected that a full system may be more complex than what is described here, both due to the interaction with other features or other writing systems, and because those designing such systems may wish to provide alternative options. Note that the terminology is based on that defined in [JLREQ].

Note: Ruby and interlinear notes

2. Considerations for the placement rules

2.1 The difficulties of ruby processing

When performing ruby layout in Japanese, the following factors need to be considered in order to decide on the placement:

  1. How to handle the correspondence between the base text and the ruby annotation.
  2. What to do when the base text is wider than its ruby annotation.
  3. What to do when the ruby annotation is wider than its base text.
  4. When the ruby annotation is wider than its base text, whether it can be allowed to overhang the characters preceding or following, and whether this affects the position of the base text.

    Note: Ruby annotations that are wider than the base text
  5. When the ruby annotation is wider than its base text and the base text is at the start or the end of the line, whether the base text or the ruby annotation should be aligned with the line edge.

  6. When there are multiple base characters, whether there can be a line wrap opportunity between them.

In movable type typography, such matters were resolved based on generic principles, and could always be corrected during the proofreading phase. Essentially, each case was adjusted individually in a flexible manner.

In computer-based typesetting, the layout needs to be more or less determined based on predetermined rules, but it remains necessary to adjust the results in certain cases, for example by changing the association between base text and the ruby annotation, or by switching to a different placement policy.

When thinking about computing placement for web content, it is not practical to decide on the positioning case by case as was done in movable type typography. It is therefore necessary to decide upon comprehensive rules that provide solutions to all the problems listed above, so that placement may be determined fully automatically. A system that covered all the possibilities that existed in movable type typesetting would need to be very complex.

2.2 Considerations for simple placement rules

Here are the fundamental assumptions underlying the simple placement rules. In this document we refer to the ruby annotation and its base text, collectively, as the ruby block.

  1. Ruby is used to display the reading or the meaning of the base characters. Therefore, the number one priority here is to avoid misreadings. Specifically, this method does not allow a ruby annotation which is wider than its base text to overhang the characters preceding or following the ruby block, whether they are kanji or kana characters.

    Note: Overhanging surrounding characters
  2. The method is agnostic to horizontal vs vertical writing, and will use the same logic in either case. Specifically, the center of the ruby string and of the base character string are aligned in the inline direction for mono-ruby.
  3. Processing is done in two steps. In the first step, processing of layout only considers the relative positions of the ruby annotation and its base text. In the second step, layout processing decides the position of the ruby block in the line, taking into consideration the preceding and following characters. On the other hand, relative positions of the ruby annotation and its base text as decided in the first step are not modified in the light of any characters preceding and following the ruby block. Also, when the ruby block is placed at the line head or the line end the method used in this document does not align the first or last character of the base text to the line head or the line end by modifying the relative positions of the ruby annotation and its base text. To summarise, the relative positions decided in the first step are not modified by the second step at all.
  4. Although there are cases where [JLREQ] and [JISX4051] list multiple ways of positioning ruby, this document only describes one method based on the policies described above. Also, the methods described in this document are mostly chosen from those provided in [JISX4051]. In some cases, this document suggests optional methods to be allowed as implementation defined, such as allowing a ruby annotation wider than its base text to overhang preceding or following kana characters.
  5. There is a demand to use larger (or smaller) font sizes for ruby annotations. In this document, the default font size of a ruby annotation is set to be half of the font size of its base text, and the examples in figures use this default font size. Sizes of spacing adjustments during justification are defined based on the font size of the base text, but not that of the ruby annotation, and this makes layout methods applicable for cases where the font size of the ruby annotation is not half that of its base text.
Note: Wrap opportunities

2.3 Types of ruby

Ruby in Japanese may be divided into the following three types, based on the relationship between the ruby and the base characters (see JLReq “3.3.1 Usage of Ruby”).

  1. Mono-ruby
  2. Jukugo-ruby
  3. Group-ruby
Figure 4 Types of ruby.

Which one to use depends on the relationship between the ruby and the base characters. Mono-ruby is used to connect ruby to a single base character, Jukugo-ruby is used when multiple base characters each have a corresponding ruby and at the same time the whole group needs to be processed together, and group-ruby is used when ruby is attached to a group of base characters together (see Figure 4). Each is used when specified.

3. Rules for Simple Placement of Japanese Ruby

3.1 Ruby character size and character placement

The size of the ruby characters and their placement in the block direction relative to the base characters are as follows:

  1. The size of the ruby is by default set to half of the size of the base characters.
  2. In vertical text, ruby is placed to the right of the base characters, and the character frame of the ruby is placed flush against the character frame of the base characters.
    Figure 5 Example of vertical ruby.
  3. In horizontal text, ruby is placed above the base characters, and the character frame of the ruby is placed flush against the character frame of the base characters.
    Figure 6 Example of horizontal ruby.

The following sections describe in detail the placement of mono-ruby, jukugo-ruby, and group-ruby. However, since jukugo-ruby is more complex, it is explained last.

3.2 Placement of mono-ruby

Mono-ruby is placed as follows.

To align the following items to the two-step processing method described in § 2.2 Considerations for simple placement rules, points 1, 2, and 3 belong to the first step, and points 4 and 5 belong to the second.

  1. When the ruby annotation consists of two or more characters, each character in the annotation is placed immediately next to its neighboring character, without any inter-character spacing. Furthermore, when the ruby annotation is composed of characters such as Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), or Western characters (cl-27) which have their own individual width, they are placed according to each character’s metrics.

    Figure 7 Example of mono-ruby with western characters.
  2. The center of the ruby annotation and of its base text are aligned in the inline direction (see Figure 8).
  3. Since the base character and its associated ruby annotation form a single unit there is no line wrapping opportunity inside mono-ruby.
  4. When the ruby annotation is wider than its base text, the part of the annotation that extends beyond the base text must not overhang the characters preceding or following (see Figure 8). Spacing is introduced accordingly between these preceding or following characters and the base characters.

    Figure 8 Example 1 of mono-ruby where the annotation is wider than its base text.

    However, in the following cases, where punctuation marks like Full stops (cl-06) have blank space before or after the letter face, ruby annotation characters do overhang the preceding or following characters (see Figure 9). (Here, the difference in the processing of punctuation marks, etc. is that they play an important role as sentence breaks, and there is blank space before and after them. It is preferable to maintain constant spacing for such preceding or following characters, as far as possible, especially because when these blank spaces become large, the meaning of the break may change. Also there are no issues like those described in the note in § 2.1 The difficulties of ruby processing, therefore, this method can use a different approach to layout on punctuation marks like Full stops (cl-06).)

    • If the character preceding the base text is one of Closing brackets (cl-02), Full stops (cl-06), Commas (cl-07), Full-width ideographic space (cl-14), or Middle dots (cl-05), then the ruby must hang over the blank portion at the end of the character. (This blank portion is usually half the character’s width, except in the case of Middle dots (cl-05), where it is a quarter of the character’s width.) However, if this blank part has been compressed due to justification or similar processing of the line, then the ruby may only hang over the resulting compressed blank spacing (e.g. if it was reduced from half to a quarter em, hang at most a quarter em).
    • If the character following the base text is one of Opening brackets (cl-01), Full-width ideographic space (cl-14), or Middle dots (cl-05), then the ruby must hang over the blank portion at the start of the character. (This blank portion is usually half the character’s width for Opening brackets (cl-01), or a quarter of the character’s width for Middle dots (cl-05).) However, if this blank part has been compressed due to justification or similar processing of the line, then the ruby may only hang over the resulting compressed blank spacing (e.g. if it was reduced from half to a quarter em, hang at most a quarter em).
    Figure 9 Example 2 of mono-ruby where the annotation is wider than its base text.
  5. When the ruby annotation is wider than its base text, and the annotation falls at the start of the line, then the start of the ruby annotation is aligned with the line’s start edge (see Figure 10), while if the annotation falls at the end of the line, then the end of the ruby annotation is aligned with the line’s end edge (see Figure 11).

    Figure 10 Example of mono-ruby at the line start.
    Figure 11 Example of mono-ruby at the line end.
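
The overhang and spacing rules in points 4 and 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not part of the rules themselves: the function names, the use of em units, and the `neighbor_blank` parameter (the blank portion of an adjacent punctuation mark that faces the ruby block, after any compression) are hypothetical and introduced only for this example.

```python
def ruby_overhang(base_width: float, ruby_width: float) -> float:
    """Amount (in em) by which a centered ruby annotation extends
    beyond its base text on each side. Zero when the ruby fits."""
    return max(0.0, (ruby_width - base_width) / 2)


def side_spacing(overhang: float, neighbor_blank: float = 0.0) -> float:
    """Spacing (in em) to insert between the base text and a neighbor.

    neighbor_blank is an assumed input: 0 for kanji or kana, and the
    remaining blank portion for punctuation such as full stops
    (usually 0.5) or middle dots (usually 0.25).
    """
    # The ruby may hang over the blank portion only; any remainder
    # must be absorbed by spacing so that no glyph is overhung.
    return max(0.0, overhang - neighbor_blank)


def base_inset_at_line_edge(overhang: float) -> float:
    """At the line start (or end), the ruby annotation, not the base
    text, is aligned with the line edge, so the base text is inset
    from that edge by the overhang."""
    return overhang


# One 1 em base character with three half-size ruby characters (1.5 em).
hang = ruby_overhang(1.0, 1.5)
spacing_next_to_kanji = side_spacing(hang)
spacing_next_to_full_stop = side_spacing(hang, neighbor_blank=0.5)
spacing_next_to_compressed_stop = side_spacing(hang, neighbor_blank=0.125)
```

In this example the annotation extends 0.25 em on each side. Next to a kanji or kana character the whole 0.25 em becomes spacing, next to a full stop with its usual 0.5 em blank no spacing is needed, and next to a full stop whose blank was compressed to 0.125 em the remaining 0.125 em is inserted as spacing.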

3.3 Placement of group-ruby

This section describes placement rules for group-ruby in terms of two groups of characters. Western characters have proportional width and include characters like Grouped numerals (cl-24), Unit symbols (cl-25), Western word space (cl-26), and Western characters (cl-27). Japanese characters have a fixed full-width size and include characters like Hiragana (cl-15), Katakana (cl-16), and Ideographic characters (cl-19) (see also 2.1.2 Kanji, Hiragana and Katakana). Western characters are read as clusters of multiple characters, and it is desirable to avoid adding spacing between characters for justification. The way items are positioned depends on how their respective lengths would compare if they were each laid out without any inter-character spacing. When their respective lengths are the same, both are laid out without inter-character spacing and placed such that their respective centers are aligned in the inline direction (see Figure 12). For other cases, the placement depends on the following.

In terms of the above-mentioned two-step processing method described in § 2.2 Considerations for simple placement rules, points (1), (2) and (3) belong to the first step, and (4) and (5) to the second.

Note: Inter-character spacing in group-ruby
  1. When both the ruby annotation and its base text contain Japanese characters, the placement depends on the following:

    • When their respective lengths are the same, both are laid out without inter-character spacing and placed such that their respective centers in the inline direction are aligned (see Figure 12).

      Figure 12 Example 1 of group-ruby.
    • When the ruby annotation is less wide than its base text, spacing is inserted between every character in the ruby annotation as well as at the start and the end of the ruby annotation so that it becomes the same width as the base text, then their centers in the inline direction are aligned. The size of the spacing inserted between each of the ruby characters is twice the size of the spacing inserted at the end and at the start (see Figure 13).

      Figure 13 Example 2 of group-ruby.

      However, the size of the spacing inserted at the start and end must be capped at no more than half the size of one base character, and the spacing inserted between each ruby character is enlarged to compensate (see Figure 14).

      Figure 14 Example 3 of group-ruby.
    • When the ruby annotation is wider than its base text, spacing is inserted between every character in the base text as well as at the start and the end of the base text so that it becomes the same length as the ruby annotation, then their centers in the inline direction are aligned. The size of the spacing inserted between each of the base characters is twice the size of the spacing inserted at the end and at the start (see Figure 15).

      Figure 15 Example 4 of group-ruby.
  2. Where a ruby annotation consists of Japanese characters and its ruby base text consists of Western characters, the placement depends on the following (see Figure 16):

    • When the ruby annotation is less wide than its base text, spacing is inserted between every character in the ruby annotation as well as at the start and the end of the ruby annotation so that it becomes the same length as the base character string, then their centers in the inline direction are aligned. The size of the spacing inserted between each of the ruby characters is twice the size of the spacing inserted at the end and at the start.
    • When the ruby annotation is wider than its base text, both are laid out without inter-character spacing and placed such that their respective centers in the inline direction are aligned. In this case, the ruby annotation extends beyond its base text.
    Figure 16 Example of ruby with western characters.
  3. Where a ruby annotation consists of Western characters and its ruby base text consists of Japanese characters, the placement depends on the following (see Figure 16):

    • When the ruby annotation is less wide than its base text, both are laid out without inter-character spacing and placed such that their respective centers in the inline direction are aligned.
    • When the ruby annotation is wider than its base text, spacing is inserted between every character in the base text as well as at the start and the end of the base text so that it becomes the same length as the ruby annotation, then their centers in the inline direction are aligned. The size of the spacing inserted between each of the base characters is twice the size of the spacing inserted at the end and at the start.
  4. When the ruby annotation is wider than its base text and extends beyond it, whether and how it hangs over characters preceding or following the base text is handled in the same way as for mono-ruby (see Figure 17). Also, when the ruby annotation is wider than its base character string and extends beyond it and is located at the start or end of the line, the resulting layout is also identical to that for mono-ruby.

    Figure 17 Example of group-ruby where the ruby annotation is wider than its base text.
  5. In the case of group-ruby, the base text and its associated ruby annotation are treated as a unit, so there is no line wrapping opportunity inside either string.

    Note: Wrap opportunities in group-ruby
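
The spacing rules in points 1 to 3 can be sketched in Python as follows. This is an illustrative sketch only. The function names, the boolean flags marking a string as Japanese or Western, and the em-based widths are assumptions made for this example. Two further assumptions are labeled in the code: the cap on edge spacing is applied only when the base text is Japanese, and the case where both strings are Western, which this section does not cover, is laid out without spacing.

```python
from typing import Optional, Tuple


def distribute_1_2_1(n_chars: int, extra: float,
                     edge_cap: Optional[float] = None) -> Tuple[float, float]:
    """Split `extra` space over a string of n_chars characters.

    Returns (edge, between): the spacing at the start and at the end,
    and the spacing between adjacent characters. By default `between`
    is twice `edge`, so 2 edges + (n - 1) gaps add up to 2 * n * edge.
    """
    if n_chars < 1 or extra <= 0:
        return 0.0, 0.0
    edge = extra / (2 * n_chars)
    between = 2 * edge
    if edge_cap is not None and edge > edge_cap and n_chars > 1:
        # Cap the edge spacing and enlarge the inner gaps to compensate.
        edge = edge_cap
        between = (extra - 2 * edge) / (n_chars - 1)
    return edge, between


def group_ruby_spacing(base_n: int, base_w: float, base_japanese: bool,
                       ruby_n: int, ruby_w: float, ruby_japanese: bool,
                       base_char_size: float = 1.0) -> Tuple[str, float, float]:
    """Return (which string is spaced, edge spacing, between spacing).

    Widths are measured with no inter-character spacing. Only Japanese
    strings are ever spaced; Western strings stay set solid.
    """
    if ruby_w < base_w and ruby_japanese:
        # Assumption: the half-character cap applies only to a Japanese base.
        cap = base_char_size / 2 if base_japanese else None
        edge, between = distribute_1_2_1(ruby_n, base_w - ruby_w, cap)
        return "ruby", edge, between
    if ruby_w > base_w and base_japanese:
        edge, between = distribute_1_2_1(base_n, ruby_w - base_w)
        return "base", edge, between
    # Equal widths, or the string that would be spaced is Western
    # (including the both-Western case, an assumption): both are set
    # solid and centered on each other.
    return "none", 0.0, 0.0


# Two half-size ruby characters (1 em) over four base characters (4 em).
short_ruby = group_ruby_spacing(4, 4.0, True, 2, 1.0, True)
# Six half-size ruby characters (3 em) over two base characters (2 em).
long_ruby = group_ruby_spacing(2, 2.0, True, 6, 3.0, True)
```

In the first example the uncapped edge spacing would be 0.75 em, so it is capped at 0.5 em and the single gap between the two ruby characters grows to 2 em. In the second example the base text is spaced with 0.25 em at each end and 0.5 em between its two characters.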

3.4 Placement of Jukugo-ruby

Jukugo-ruby is placed as follows.

In terms of the above-mentioned two-step processing method described in § 2.2 Considerations for simple placement rules, points (1), (2) and (3) belong to the first step, and (4) to the second.

  1. With jukugo-ruby, each base text is associated with its own ruby annotation. When every one of these ruby annotations, laid out without inter-character spacing, is no wider than its corresponding base text, placement is determined as follows:

    • When the ruby annotation associated with an individual base text is 1 character long, the ruby annotation and the base text are placed such that their respective centers in the inline direction are aligned (see Figure 19).

      Figure 19 Example 1 of jukugo-ruby.
    • When the ruby annotation associated with an individual base character is 2 characters long or more, the ruby annotation is laid out without inter-character spacing, and placed such that its center and the center of its base text are aligned in the inline direction (see Figure 19).

  2. For simple ruby implementations, if even a single ruby annotation is longer than its corresponding base text when laid out without inter-character spacing, the resulting layout would look identical to group-ruby (see Figure 20 and Figure 21).

    Figure 20 Example 2 of jukugo-ruby.
    Figure 21 Example 3 of jukugo-ruby.
  3. With jukugo-ruby, individual base texts and their associated ruby annotations are treated as a unit, and line wrap opportunities are allowed between two base texts. When such a line wrap occurs, if a single base text that is part of the jukugo is placed alone at the end or at the start of a line, it is laid out identically to mono-ruby; conversely when several base texts that are part of the jukugo are placed together at the end or start of a line, they are laid out together as has been described in this section about jukugo-ruby (see Figure 22).

    Figure 22 Example of wrapping jukugo-ruby.
  4. When the ruby annotation is wider than its base text and extends beyond it, whether and how it hangs over characters preceding or following the base text is handled in the same way as for mono-ruby. Also, when the ruby annotation is wider than its base text, extends beyond it, and is located at the start or end of the line, the resulting layout is also identical to that of mono-ruby.
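
The choice between per-character placement (point 1) and falling back to group-ruby layout (point 2) can be sketched in Python as follows. This is a minimal sketch; the function names, the mode labels, and the representation of a jukugo as a list of (base width, ruby width) pairs measured in em without inter-character spacing are assumptions made for this example.

```python
from typing import List, Tuple

Pair = Tuple[float, float]  # (base_width, ruby_width), both set solid


def jukugo_layout_mode(pairs: List[Pair]) -> str:
    """Decide how a jukugo-ruby run is laid out.

    If even one annotation is wider than its own base text, the whole
    run is laid out like group-ruby; otherwise each annotation is
    centered on its own base text.
    """
    if any(ruby_w > base_w for base_w, ruby_w in pairs):
        return "group"
    return "per-character"


def per_character_offsets(pairs: List[Pair]) -> List[float]:
    """Start offset of each annotation from the start of its own base
    text when it is centered on it (per-character mode only)."""
    return [(base_w - ruby_w) / 2 for base_w, ruby_w in pairs]


def merged_widths(pairs: List[Pair]) -> Pair:
    """Total base and ruby widths, for handing the run to a group-ruby
    routine when the fallback applies."""
    return (sum(b for b, _ in pairs), sum(r for _, r in pairs))


fits = [(1.0, 1.0), (1.0, 0.5)]       # 2 and 1 half-size ruby characters
overflows = [(1.0, 1.5), (1.0, 0.5)]  # first annotation has 3 characters
```

With `fits`, both annotations stay within their base characters, so each is centered individually. With `overflows`, the three-character annotation is wider than its base character, so the run is treated as a single group-ruby with a 2 em base and a 2 em annotation.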

4. Placement of Double-Sided Ruby

4.1 Placement of double-sided ruby by combinations of ruby types

Quite complex methods are required to apply the full rules for placement of double-sided ruby. For simple placement of double-sided ruby, rules can be written for each of the combinations of mono-ruby, group-ruby, and jukugo-ruby on the two sides. As we are using two-step processing, ruby annotations that extend beyond the ruby base text, their relationship with preceding and following characters, and placement at the line head or the line end are all processed in the same way as when ruby annotations are attached on one side only.

Note: Space between adjacent lines and double-sided ruby

4.2 Double-sided ruby combinations

Double-sided ruby can be classified into five combinations:

  1. mono-ruby and mono-ruby
  2. group-ruby and group-ruby
  3. mono-ruby and group-ruby
  4. mono-ruby and jukugo-ruby
  5. jukugo-ruby with group-ruby or jukugo-ruby

4.3 Combinations of ruby types and placement methods

[JISX4051] provides procedures for handling (1), (2), and (3) (see a note in JLReq). But (3) is actually handled as (2) by first concatenating mono-ruby strings to form a single group-ruby.

We propose to handle (4) as (1) by first converting jukugo-ruby to mono-ruby, and to handle (5) as (2) by first converting jukugo-ruby to group-ruby.

This document, therefore, provides rules for simple placement of double-sided ruby using cases (1) and (2).
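
The reduction of the five combinations to cases (1) and (2) can be summarized in a short Python sketch. The function name and the string labels for the ruby types and the resulting cases are assumptions made for this example.

```python
def reduce_double_sided(side_a: str, side_b: str) -> str:
    """Map a pair of ruby types to the case whose rules are applied.

    Returns "mono+mono" for case (1) or "group+group" for case (2).
    """
    kinds = {side_a, side_b}
    if not kinds <= {"mono", "group", "jukugo"}:
        raise ValueError(f"unknown ruby type in {sorted(kinds)}")
    # (1) mono + mono, and (4) mono + jukugo with the jukugo-ruby
    # first converted to mono-ruby.
    if kinds == {"mono"} or kinds == {"mono", "jukugo"}:
        return "mono+mono"
    # (2) group + group, (3) mono + group with the mono-ruby strings
    # concatenated, and (5) jukugo with group or jukugo, with the
    # jukugo-ruby first converted to group-ruby.
    return "group+group"
```

The order of the two arguments does not matter, since the mapping depends only on which types are present.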

The side on which each ruby annotation is placed depends on the specification.

4.4 Placement of the combination mono-ruby and mono-ruby

In the combination mono-ruby plus mono-ruby, ruby annotations are set solid, and are placed so that their center aligns with that of the ruby base text (see Figure 23). For other points, follow the rules for placement of mono-ruby described in § 3.2 Placement of mono-ruby.

Figure 23 Double-sided ruby where both sides are mono-ruby.

4.5 Placement of the combination group-ruby and group-ruby

When both of the ruby annotations are less wide than the ruby base text, follow the rules for placement of group-ruby described in § 3.3 Placement of group-ruby. When a ruby annotation consists of Japanese characters, as defined in § 3.3 Placement of group-ruby, spacing is inserted between every character in the ruby annotation as well as at the start and the end of the ruby annotation (see Figure 24).

Figure 24 Double-sided ruby where both sides are group-ruby.

When one of the ruby annotations is wider than its base text, take the longer ruby annotation and place it following the rules for placement of group-ruby, as described in § 3.3 Placement of group-ruby. When the ruby base text consists of Japanese characters, as defined in § 3.3 Placement of group-ruby, spacing is inserted between every character in the ruby base text as well as at the start and the end of the ruby base text. After the ruby base text has been placed, place the shorter ruby annotation relative to the length of the ruby base text measured without the spacing at its start and end, but including the inter-character spacing when the ruby base text consists of Japanese characters.

When the shorter ruby annotation is wider than its ruby base text with inter-character spacing, the shorter ruby annotation is set solid and is placed so that its center matches that of its base text (see Figure 25).

Figure 25 Second example of double-sided ruby where both sides are group-ruby.

When the shorter ruby annotation is less wide than its base text with inter-character spacing, follow the rules for the placement of group-ruby described in § 3.3 Placement of group-ruby, using the length of the ruby base text with inter-character spacing. When the shorter ruby annotation consists of Japanese characters, as described in § 3.3 Placement of group-ruby, spacing is inserted between every character in the ruby annotation as well as at the start and the end of the ruby annotation (see Figure 26).

Figure 26 Third example of double-sided ruby where both sides are group-ruby.

For other points, follow the same rules as for the placement of group-ruby described in § 3.3 Placement of group-ruby.
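
The case where one annotation is wider than the base text can be sketched in Python as follows. This is a rough sketch under several assumptions that are not stated in this section: all three strings consist of Japanese characters, widths are in em and measured without inter-character spacing, the cap on edge spacing from § 3.3 is omitted for brevity, and the function and parameter names are hypothetical.

```python
from typing import Tuple


def split_1_2_1(n_chars: int, extra: float) -> Tuple[float, float]:
    """Return (edge, between) spacing, with between = 2 * edge, so that
    2 edges + (n_chars - 1) gaps add up to `extra` plus one unused gap
    share folded into the edges (2 * n_chars * edge == extra)."""
    edge = extra / (2 * n_chars)
    return edge, 2 * edge


def double_sided_group(base_n: int, base_w: float, long_w: float,
                       short_n: int, short_w: float
                       ) -> Tuple[float, float, float, float]:
    """Return (base_edge, base_between, short_edge, short_between).

    Preconditions (assumed): long_w > base_w and short_w <= long_w.
    """
    # Step 1: space the base text so that it matches the longer annotation.
    base_edge, base_between = split_1_2_1(base_n, long_w - base_w)
    # Step 2: measure the base text without its start and end spacing.
    inner = long_w - 2 * base_edge
    if short_w >= inner:
        # The shorter annotation is set solid and centered on the base.
        return base_edge, base_between, 0.0, 0.0
    # Otherwise it is spaced like group-ruby against the inner length.
    short_edge, short_between = split_1_2_1(short_n, inner - short_w)
    return base_edge, base_between, short_edge, short_between


# Base: 2 characters (2 em). Longer annotation: 8 half-size characters (4 em).
spaced_short = double_sided_group(2, 2.0, 4.0, short_n=2, short_w=1.0)
solid_short = double_sided_group(2, 2.0, 4.0, short_n=7, short_w=3.5)
```

In both examples the base text receives 0.5 em at each end and 1 em between its two characters, leaving an inner length of 3 em. A 1 em shorter annotation is then spaced out to fill those 3 em, whereas a 3.5 em shorter annotation exceeds the inner length and is set solid.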

A. References

A.1 Informative references

[JISX4051]
Formatting rules for Japanese documents (『日本語文書の組版方法』; JIS X 4051). Japanese Standards Association. 2004.
[JLREQ]
Requirements for Japanese Text Layout. Yasuhiro Anan; Hiroyuki Chiba; Junzaburo Edamoto; Richard Ishida; Tatsuo KOBAYASHI; Toshi Kobayashi; Kenzou Onozawa; Felix Sasaki; Seiichi Kato; Hajime Shiozawa et al. W3C. 3 April 2012. W3C Note. URL: https://www.w3.org/TR/jlreq/