CSS Text Module Level 3

W3C Working Draft, 6 December 2018

This version:
https://www.w3.org/TR/2018/WD-css-text-3-20181206/
Latest published version:
https://www.w3.org/TR/css-text-3/
Editor's Draft:
https://drafts.csswg.org/css-text-3/
Test Suite:
http://test.csswg.org/suites/css3-text/nightly-unstable/
Issue Tracking:
Tracker
Inline In Spec
GitHub Issues
Editors:
Elika J. Etemad / fantasai (Invited Expert)
(Invited Expert)
Florian Rivoal (Invited Expert)

Abstract

This CSS module defines properties for text manipulation and specifies their processing model. It covers line breaking, justification and alignment, white space handling, and text transformation.

CSS is a language for describing the rendering of structured documents (such as HTML and XML) on screen, on paper, etc.

Status of this document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at https://www.w3.org/TR/.

Publication as a Working Draft does not imply endorsement by the W3C Membership. This is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to cite this document as other than work in progress.

GitHub Issues are preferred for discussion of this specification. When filing an issue, please put the text “css-text” in the title, preferably like this: “[css-text] …summary of comment…”. All issues and comments are archived, and there is also a historical archive.

This document was produced by the CSS Working Group (part of the Style Activity).

This document was produced by a group operating under the W3C Patent Policy. W3C maintains a public list of any patent disclosures made in connection with the deliverables of the group; that page also includes instructions for disclosing a patent. An individual who has actual knowledge of a patent which the individual believes contains Essential Claim(s) must disclose the information in accordance with section 6 of the W3C Patent Policy.

This document is governed by the 1 February 2018 W3C Process Document.

This publication partially addresses the issues in the disposition of comments since the October 2013 Last Call Working Draft, and, while a marked improvement over the previous draft, is not considered to be entirely up-to-date at the time of publication. A completed disposition of comments and corresponding draft will be published once the issues are fully addressed and reviewed by the CSSWG and Internationalization WG.

The following features are at-risk, and may be dropped during the CR period:

“At-risk” is a W3C Process term-of-art, and does not necessarily imply that the feature is in danger of being dropped or delayed. It means that the WG believes the feature may have difficulty being interoperably implemented in a timely manner, and marking it as such allows the WG to drop the feature if necessary when transitioning to the Proposed Rec stage, without having to publish a new Candidate Rec without the feature first.

1. Introduction

This module describes the typesetting controls of CSS; that is, the features of CSS that control the translation of source text to formatted, line-wrapped text. Various CSS properties provide control over case transformation, white space collapsing, text wrapping, line breaking rules and hyphenation, alignment and justification, spacing, and indentation.

Note: Font selection is covered in CSS Fonts Level 3 [CSS-FONTS-3].

Features for decorating text, such as underlines, emphasis marks, and shadows (previously part of this module), are covered in CSS Text Decoration Level 3 [CSS-TEXT-DECOR-3].

Bidirectional and vertical text are addressed in CSS Writing Modes Level 3 [CSS-WRITING-MODES-3].

Further information about the typesetting requirements of various languages and writing systems around the world can be found in the Internationalization Working Group’s Typography Index. [TYPOGRAPHY]

1.1. Module Interactions

This module, together with [CSS-TEXT-DECOR-3], replaces and extends the text-level features defined in [CSS2] chapter 16.

In addition to the terms defined below, other terminology and concepts used in this specification are defined in [CSS2] and [CSS-WRITING-MODES-3].

1.2. Values

This specification follows the CSS property definition conventions from [CSS2]. Value types not defined in this specification are defined in CSS Values & Units [CSS-VALUES-3]. Other CSS modules may expand the definitions of these value types.

In addition to the property-specific values listed in their definitions, all properties defined in this specification also accept the CSS-wide keywords as their property value. For readability they have not been repeated explicitly.
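
As an illustrative sketch (the selectors are hypothetical and not part of this specification), the CSS-wide keywords initial, inherit, and unset can be used with the properties of this module as follows:

```css
/* CSS-wide keywords are accepted by every property in this module. */
blockquote { text-transform: inherit; }  /* take the parent's computed value */
code       { white-space: initial; }     /* reset to the initial value, normal */
.reset     { text-transform: unset; }    /* text-transform is inherited, so this acts like inherit */
```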

1.3. Languages and Typesetting

Authors should language-tag their content accurately for the best typographic behavior.

The content language of an element is the (human) language the element is declared to be in, according to the rules of the document language. For example, the rules for determining the content language of an HTML element are defined in [HTML], and the rules for determining the content language of an XML element are defined in [XML10]. Note that it is possible for the content language of an element to be unknown: for example, untagged content, or content in a document language that does not have a language-tagging facility, is considered to have an unknown content language.

Note: Authors can tag content using the global lang attribute in HTML, the universal xml:lang attribute in XML, and the HTTP Content-Language header for content served over HTTP.

The content language an element is declared to be in also identifies the specific written form of that language used in that element, known as the content writing system.

Note: Depending on the document language's facilities for identifying the content language, information about the writing system may only be carried implicitly. That is typically the case with the [BCP47] language tag used in [HTML], although it can optionally indicate the writing system explicitly using a script subtag.

Language and writing system conventions can affect line breaking, hyphenation, justification, glyph selection, and many other typographic effects. In CSS, language-specific typographic tailorings are only applied when the content language is known (declared). Therefore, higher quality typography requires authors to communicate to the UA the correct linguistic context of the text in the document.
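
The following sketch illustrates why accurate tagging matters. It assumes markup such as <p lang="en"> and uses the hyphens property listed in the Property Index. Automatic hyphenation is one of the language-specific tailorings that a UA can only be expected to apply when the content language is declared.

```css
/* Matches only paragraphs whose content language is declared as English,
   e.g. <p lang="en">. Untagged paragraphs are not matched, and a UA has no
   linguistic context for choosing hyphenation rules for them. */
p:lang(en) { hyphens: auto; }

/* Japanese content can request stricter line-breaking rules the same way. */
p:lang(ja) { line-break: strict; }
```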

More information about language tags and their interpretation, particularly the use of script tags for atypical language + writing-system combinations, can be found in Appendix F. Tagging Content by Writing System.

1.4. Characters and Letters

The basic unit of typesetting is the character. However, because writing systems are not always as simple as the basic English alphabet, what a character actually is depends on the context in which the term is used. For example, in Hangul (the Korean writing system), each square representation of a syllable (e.g. 한=Han) can be considered a character. However, the square symbol is really composed of multiple letters each representing a phoneme (e.g. ㅎ=h, ㅏ=a, ㄴ=n) and these also could each be considered a character.

A basic unit of computer text encoding, for any given encoding, is also called a character, and depending on the encoding, a single encoding character might correspond to the entire pre-composed syllabic character (e.g. 한), to the individual phonemic character (e.g. ㅎ), or to smaller units such as a base letterform (e.g. ㅇ) and any combining marks that vary it (e.g. extra strokes that represent aspiration).

In turn, a single encoding character can be represented in the data stream as one or more bytes; and in programming environments one byte is sometimes also called a character.

Therefore the term character is fairly ambiguous where technical precision is required.

For text layout, we will refer to the typographic character unit as the basic unit of text. Even within the realm of text layout, the relevant character unit depends on the operation. For example, line-breaking and letter-spacing will segment a sequence of Thai characters that include U+0E33 THAI CHARACTER SARA AM differently; or the behaviour of a conjunct consonant in a script such as Devanagari may depend on the font in use. So the typographic character represents a unit of the writing system— such as a Latin alphabetic letter (including its diacritics), Hangul syllable, Chinese ideographic character, Myanmar syllable cluster— that is indivisible with respect to a particular typographic operation (line-breaking, first-letter effects, tracking, justification, vertical arrangement, etc.).

Unicode Standard Annex #29: Text Segmentation defines a unit called the grapheme cluster which approximates the typographic character. A UA must use the extended grapheme cluster (not legacy grapheme cluster), as defined in [UAX29], as the basis for its typographic character unit. However, the UA should tailor the definitions as required by typographic tradition since the default rules are not always appropriate or ideal—and is expected to tailor them differently depending on the operation as needed.

Note: The rules for such tailorings are out of scope for CSS.

The following are some examples of typographic character unit tailorings required by standard typesetting practice:

A typographic letter unit or letter for the purpose of this specification is a typographic character unit belonging to one of the Letter or Number general categories in Unicode. [UAX44] See Character Properties for how to determine the Unicode properties of a typographic character unit.

The rendering characteristics of a typographic character unit divided by an element boundary are undefined. Ideally each component should be rendered according to the formatting requirements of its respective element’s properties while maintaining correct shaping and positioning of the typographic character unit as a whole. However, depending on the nature of the formatting differences between its parts and the capabilities of the font technology in use, this is not always possible. Therefore such a typographic character unit may be rendered as belonging to either side of the boundary, or as some approximation of belonging to both. Authors are forewarned that dividing grapheme clusters by element boundaries may give inconsistent or undesired results.

2. Transforming Text

2.1. Case Transforms: the text-transform property

Name: text-transform
Value: none | [capitalize | uppercase | lowercase ] || full-width || full-size-kana
Initial: none
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Computed value: specified keyword
Canonical order: n/a
Animation type: discrete

This property transforms text for styling purposes. It has no effect on the underlying content, and must not affect the content of a plain text copy & paste operation.

Values have the following meanings:

none
No effects.
capitalize
Puts the first typographic letter unit of each word, if lowercase, in titlecase; other characters are unaffected.
uppercase
Puts all letters in uppercase.
lowercase
Puts all letters in lowercase.
full-width
Puts all typographic character units in fullwidth form. If a character does not have a corresponding fullwidth form, it is left as is. This value is typically used to typeset Latin letters and digits as if they were ideographic characters.
full-size-kana
Converts all small Kana characters to the equivalent full-size Kana. This value is typically used for ruby annotation text, where authors may want all small Kana to be drawn as large Kana to compensate for legibility issues at the small font sizes typically used in ruby.

For capitalize, what constitutes a “word” is UA-dependent; [UAX29] is suggested (but not required) for determining such word boundaries. Authors should not expect capitalize to follow language-specific titlecasing conventions (such as skipping articles in English).
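
As a minimal sketch of the three case values (the class names are hypothetical, and the sample strings in the comments assume words separated by plain spaces):

```css
.title { text-transform: capitalize; } /* "a tale of two cities" -> "A Tale Of Two Cities" */
.shout { text-transform: uppercase; }  /* "Hello" -> "HELLO" */
.quiet { text-transform: lowercase; }  /* "Hello" -> "hello" */
```

Note that capitalize titlecases "of" as well: it does not apply English titlecasing conventions.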

The following example converts the ASCII characters used in abbreviations in Japanese text to their fullwidth variants so that they lay out and line break like ideographs:

abbr:lang(ja) { text-transform: full-width; }

Note: As defined in Text Processing Order of Operations, transforming text affects line-breaking and other formatting operations.

The UA must use the full case mappings for Unicode characters, including any conditional casing rules, as defined in the Default Case Algorithm section of The Unicode Standard [UNICODE]. If (and only if) the content language of the element is, according to the rules of the document language, known, then any appropriate language-specific rules must be applied as well. These minimally include, but are not limited to, the language-specific rules in Unicode’s SpecialCasing.txt.

For example, in Turkish there are two “i”s, one with a dot—“İ” and “i”— and one without—“I” and “ı”. Thus the usual case mappings between “I” and “i” are replaced with a different set of mappings to their respective undotted/dotted counterparts, which do not exist in English. This mapping must only take effect if the content language is Turkish written in its modern Latin-based writing system (or another Turkic language that uses Turkish casing rules); in other languages, the usual mapping of “I” and “i” is required. This rule is thus conditionally defined in Unicode’s SpecialCasing.txt file.
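
The following sketch shows how a single rule can produce different results depending on the declared content language. The markup in the comments is assumed for illustration.

```css
h1 { text-transform: uppercase; }

/* <h1>straße</h1>             renders as STRASSE  (full case mapping of ß) */
/* <h1 lang="tr">istanbul</h1> renders as İSTANBUL (Turkish: i maps to dotted İ) */
/* <h1 lang="en">istanbul</h1> renders as ISTANBUL (default mapping of i to I) */
```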

The definition of fullwidth and halfwidth forms can be found on the Unicode consortium web site at [UAX11]. The mapping to fullwidth form is defined by taking code points with the <wide> or the <narrow> tag in their Decomposition_Mapping in [UAX44]. For the <narrow> tag, the mapping is from the code point to the decomposition (minus <narrow> tag), and for the <wide> tag, the mapping is from the decomposition (minus the <wide> tag) back to the original code point.
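
A sketch of the resulting mapping, using a hypothetical class name; the code points in the comments follow the Decomposition_Mapping values in the Unicode Character Database:

```css
.wide { text-transform: full-width; }

/* "A" (U+0041) -> "Ａ" (U+FF21): U+FF21 decomposes to <wide> 0041,
                   so the mapping runs from the decomposition back to U+FF21 */
/* "ｶ" (U+FF76) -> "カ" (U+30AB): U+FF76 decomposes to <narrow> 30AB,
                   so the mapping runs from the code point to its decomposition */
/* "é" (U+00E9) -> unchanged: it has no corresponding fullwidth form */
```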

The mappings for small Kana to full-size Kana are defined in Appendix G. Small Kana Mappings.
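
For instance, a style sheet might apply this value to ruby annotations as sketched below. The rt selector assumes HTML ruby markup.

```css
/* Ruby text is usually set at a small font size, where small Kana are hard to read. */
rt { text-transform: full-size-kana; }

/* small "ゃ" (U+3083) is drawn as full-size "や" (U+3084) */
```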

When multiple values are specified and therefore multiple transformations need to be applied, they are applied in the following order:

  1. capitalize, uppercase, and lowercase
  2. full-width
  3. full-size-kana
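
As an illustration of this ordering (the abbr selector is only an example), a declaration that combines two values is processed in the listed order regardless of how the keywords are written in the style sheet:

```css
abbr { text-transform: uppercase full-width; }

/* "css" -> step 1 (uppercase)  -> "CSS"
         -> step 2 (full-width) -> "ＣＳＳ" */

/* Equivalent: the || combinator allows the keywords in any order. */
abbr { text-transform: full-width uppercase; }
```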

Text transformation happens after white space processing, which means that full-width only transforms U+0020 spaces to U+3000 within preserved white space.
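
A sketch of this interaction, with hypothetical class names:

```css
/* Spaces are collapsible here, so they remain U+0020 after the transform. */
.collapsed { white-space: normal; text-transform: full-width; }

/* Spaces are preserved here, so each U+0020 becomes U+3000 IDEOGRAPHIC SPACE. */
.preserved { white-space: pre; text-transform: full-width; }
```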

Note: A future level of CSS may introduce the ability to create custom mapping tables for less common text transforms, such as by an @text-transform rule similar to @counter-style from [CSS-COUNTER-STYLES-3].

3. White Space and Wrapping: the white-space property

Name: white-space
Value: normal | pre | nowrap | pre-wrap | break-spaces | pre-line
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Computed value: specified keyword
Canonical order: n/a
Animation type: discrete

This property specifies two things: whether and how white space is collapsed, and whether lines may wrap at soft wrap opportunities.

Values have the following meanings, which must be interpreted according to the White Space Processing and Line Breaking rules:

normal
This value directs user agents to collapse sequences of white space into a single character (or in some cases, no character). Lines may wrap at allowed soft wrap opportunities, as determined by the line-breaking rules in effect, in order to minimize inline-axis overflow.
pre
This value prevents user agents from collapsing sequences of white space. Segment breaks such as line feeds are preserved as forced line breaks. Lines only break at forced line breaks; content that does not fit within the block container overflows it.
nowrap
Like normal, this value collapses white space; but like pre, it does not allow wrapping.
pre-wrap
Like pre, this value preserves white space; but like normal, it allows wrapping.
break-spaces
The behavior is identical to that of pre-wrap, except that:
  • Any sequence of preserved white space always takes up space, including at the end of the line.

  • A line breaking opportunity exists after every preserved white space character, including between white space characters.

As preserved spaces take up space and do not hang, they affect the box’s intrinsic sizes (min-content size and max-content size).

Note: This value does not guarantee that there will never be any overflow due to spaces: for example, if the line length is so short that even a single space does not fit, overflow is unavoidable.

pre-line
Like normal, this value collapses consecutive spaces and allows wrapping, but preserves segment breaks in the source as forced line breaks.

The following informative table summarizes the behavior of various white-space values:

Value         New Lines  Spaces and Tabs  Text Wrapping  End-of-line spaces
normal        Collapse   Collapse         Wrap           Remove
pre           Preserve   Preserve         No wrap        Preserve
nowrap        Collapse   Collapse         No wrap        Remove
pre-wrap      Preserve   Preserve         Wrap           Collapse or hang
break-spaces  Preserve   Preserve         Wrap           Wrap
pre-line      Preserve   Collapse         Wrap           Remove
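
The following sketch shows one plausible use for each value. The selectors are hypothetical and chosen only for illustration.

```css
p          { white-space: normal; }       /* collapse white space, wrap as needed */
pre        { white-space: pre; }          /* keep spaces and line feeds, never wrap */
.nav-label { white-space: nowrap; }       /* collapse white space, keep on one line */
pre.wrap   { white-space: pre-wrap; }     /* keep spaces and line feeds, wrap as needed */
.editor    { white-space: break-spaces; } /* like pre-wrap, but end-of-line spaces take up
                                             space and may wrap to the next line */
.address   { white-space: pre-line; }     /* collapse spaces, keep line feeds as breaks */
```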

See White Space Processing Rules for details on how white space collapses. An informative summary of collapsing (normal and nowrap) is presented below:

See Line Breaking for details on wrapping behavior.

4. White Space Processing Details

The source text of a document often contains formatting that is not relevant to the final rendering: for example, breaking the source into segments (lines) for ease of editing or adding white space characters such as tabs and spaces to indent the source code. CSS white space processing allows the author to control interpretation of such formatting: to preserve or collapse it away when rendering the document. White space processing in CSS interprets white space characters only for rendering: it has no effect on the underlying document data.

White space processing in CSS is controlled with the white-space property.

CSS does not define document segmentation rules. Segments can be separated by a particular newline sequence (such as a line feed or CRLF pair), or delimited by some other mechanism, such as the SGML RECORD-START and RECORD-END tokens. For CSS processing, each document language–defined segment break and each line feed (U+000A) in the text is treated as a segment break, which is then interpreted for rendering as specified by the white-space property.

Note: A document parser might not only normalize any segment breaks, but also collapse other space characters or otherwise process white space according to markup rules. Because CSS processing occurs after the parsing stage, it is not possible to restore these characters for styling. Therefore, some of the behavior specified below can be affected by these limitations and may be user agent dependent.

Note: Anonymous blocks consisting entirely of collapsible white space are removed from the rendering tree. Thus any such white space surrounding a block-level element is collapsed away. See [CSS2] section 9.2.2.1.

Form feeds (U+000C) (that are not segment breaks) are rendered as a zero-width space (U+200B). Control characters (Unicode category Cc) other than tab (U+0009), line feed (U+000A), and form feed (U+000C) must be rendered as a visible glyph, which the UA must synthesize if the glyphs found in the font are not visible, and otherwise treated as any other character of the Other Symbols (So) general category and Common script. The UA may use a glyph provided by a font specifically for the control character, substitute the glyphs provided for the corresponding symbol in the Control Pictures block, generate a visual representation of its codepoint value, or use some other method to provide an appropriate visible glyph. As required by [UNICODE], unsupported Default_ignorable characters must be ignored for rendering.

4.1. The White Space Processing Rules

White space processing in CSS affects only the document white space characters: spaces (U+0020), tabs (U+0009), and segment breaks.

Note: The set of characters considered document white space (part of the document content) and that considered syntactic white space (part of the CSS syntax) are not necessarily identical. However, since both include spaces (U+0020), tabs (U+0009), and line feeds (U+000A) most authors won’t notice any differences.

4.1.1. Phase I: Collapsing and Transformation

For each inline (including anonymous inlines; see [CSS2] section 9.2.2.1) within an inline formatting context, white space characters are handled as follows, ignoring bidi formatting characters (characters with the Bidi_Control property [UAX9]) as if they were not there:

Conformance

Document conventions

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Advisements are normative sections styled to evoke special attention and are set apart from other normative text with <strong class="advisement">, like this: UAs MUST provide an accessible alternative.

Conformance classes

Conformance to this specification is defined for three conformance classes:

style sheet
A CSS style sheet.
renderer
A UA that interprets the semantics of a style sheet and renders documents that use them.
authoring tool
A UA that writes a style sheet.

A style sheet is conformant to this specification if all of its statements that use syntax defined in this module are valid according to the generic CSS grammar and the individual grammars of each feature defined in this module.

A renderer is conformant to this specification if, in addition to interpreting the style sheet as defined by the appropriate specifications, it supports all the features defined by this specification by parsing them correctly and rendering the document accordingly. However, the inability of a UA to correctly render a document due to limitations of the device does not make the UA non-conformant. (For example, a UA is not required to render color on a monochrome monitor.)

An authoring tool is conformant to this specification if it writes style sheets that are syntactically correct according to the generic CSS grammar and the individual grammars of each feature in this module, and meet all other conformance requirements of style sheets as described in this module.

Requirements for Responsible Implementation of CSS

The following sections define several conformance requirements for implementing CSS responsibly, in a way that promotes interoperability in the present and future.

Partial Implementations

So that authors can exploit the forward-compatible parsing rules to assign fallback values, CSS renderers must treat as invalid (and ignore as appropriate) any at-rules, properties, property values, keywords, and other syntactic constructs for which they have no usable level of support. In particular, user agents must not selectively ignore unsupported property values and honor supported values in a single multi-value property declaration: if any value is considered invalid (as unsupported values must be), CSS requires that the entire declaration be ignored.
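
As a sketch of how authors can rely on this rule (the class name is hypothetical), a widely supported value can be declared first and a newer value second:

```css
.editor {
  white-space: pre-wrap;      /* fallback for UAs without break-spaces support */
  white-space: break-spaces;  /* ignored as invalid where unsupported; otherwise
                                 it wins as the later declaration */
}
```

A UA that does not support break-spaces must drop the second declaration entirely, leaving pre-wrap in effect.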

Implementations of Unstable and Proprietary Features

To avoid clashes with future stable CSS features, the CSSWG recommends following best practices for the implementation of unstable features and proprietary extensions to CSS.

Implementations of CR-level Features

Once a specification reaches the Candidate Recommendation stage, implementers should release an unprefixed implementation of any CR-level feature they can demonstrate to be correctly implemented according to spec, and should avoid exposing a prefixed variant of that feature.

To establish and maintain the interoperability of CSS across implementations, the CSS Working Group requests that non-experimental CSS renderers submit an implementation report (and, if necessary, the testcases used for that implementation report) to the W3C before releasing an unprefixed implementation of any CSS features. Testcases submitted to W3C are subject to review and correction by the CSS Working Group.

Further information on submitting testcases and implementation reports can be found on the CSS Working Group’s website at https://www.w3.org/Style/CSS/Test/. Questions should be directed to the public-css-testsuite@w3.org mailing list.

References

Normative References

[CSS-BACKGROUNDS-3]
Bert Bos; Elika Etemad; Brad Kemper. CSS Backgrounds and Borders Module Level 3. 17 October 2017. CR. URL: https://www.w3.org/TR/css-backgrounds-3/
[CSS-BOX-3]
Elika Etemad. CSS Box Model Module Level 3. 9 August 2018. WD. URL: https://www.w3.org/TR/css-box-3/
[CSS-CASCADE-4]
Elika Etemad; Tab Atkins Jr. CSS Cascading and Inheritance Level 4. 28 August 2018. CR. URL: https://www.w3.org/TR/css-cascade-4/
[CSS-DISPLAY-3]
Tab Atkins Jr.; Elika Etemad. CSS Display Module Level 3. 28 August 2018. CR. URL: https://www.w3.org/TR/css-display-3/
[CSS-FONTS-3]
John Daggett; Myles Maxfield; Chris Lilley. CSS Fonts Module Level 3. 20 September 2018. REC. URL: https://www.w3.org/TR/css-fonts-3/
[CSS-FONTS-4]
John Daggett; Myles Maxfield; Chris Lilley. CSS Fonts Module Level 4. 20 September 2018. WD. URL: https://www.w3.org/TR/css-fonts-4/
[CSS-INLINE-3]
Dave Cramer; Elika Etemad; Steve Zilles. CSS Inline Layout Module Level 3. 8 August 2018. WD. URL: https://www.w3.org/TR/css-inline-3/
[CSS-OVERFLOW-3]
David Baron; Elika Etemad; Florian Rivoal. CSS Overflow Module Level 3. 31 July 2018. WD. URL: https://www.w3.org/TR/css-overflow-3/
[CSS-POSITION-3]
Rossen Atanassov; Arron Eicholz. CSS Positioned Layout Module Level 3. 17 May 2016. WD. URL: https://www.w3.org/TR/css-position-3/
[CSS-RUBY-1]
Elika Etemad; Koji Ishii. CSS Ruby Layout Module Level 1. 5 August 2014. WD. URL: https://www.w3.org/TR/css-ruby-1/
[CSS-SIZING-3]
Tab Atkins Jr.; Elika Etemad. CSS Intrinsic & Extrinsic Sizing Module Level 3. 4 March 2018. WD. URL: https://www.w3.org/TR/css-sizing-3/
[CSS-VALUES-3]
Tab Atkins Jr.; Elika Etemad. CSS Values and Units Module Level 3. 14 August 2018. CR. URL: https://www.w3.org/TR/css-values-3/
[CSS-VALUES-4]
Tab Atkins Jr.; Elika Etemad. CSS Values and Units Module Level 4. 10 October 2018. WD. URL: https://www.w3.org/TR/css-values-4/
[CSS-WRITING-MODES-3]
Elika Etemad; Koji Ishii. CSS Writing Modes Level 3. 24 May 2018. CR. URL: https://www.w3.org/TR/css-writing-modes-3/
[CSS-WRITING-MODES-4]
Elika Etemad; Koji Ishii. CSS Writing Modes Level 4. 24 May 2018. CR. URL: https://www.w3.org/TR/css-writing-modes-4/
[CSS2]
Bert Bos; et al. Cascading Style Sheets Level 2 Revision 1 (CSS 2.1) Specification. 7 June 2011. REC. URL: https://www.w3.org/TR/CSS2/
[CSSOM-1]
Simon Pieters; Glenn Adams. CSS Object Model (CSSOM). 17 March 2016. WD. URL: https://www.w3.org/TR/cssom-1/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[SELECTORS-4]
Elika Etemad; Tab Atkins Jr. Selectors Level 4. 2 February 2018. WD. URL: https://www.w3.org/TR/selectors-4/
[UAX11]
Ken Lunde 小林劍. East Asian Width. 14 May 2017. Unicode Standard Annex #11. URL: https://www.unicode.org/reports/tr11/tr11-33.html
[UAX14]
Andy Heninger. Unicode Line Breaking Algorithm. 12 June 2017. Unicode Standard Annex #14. URL: https://www.unicode.org/reports/tr14/tr14-39.html
[UAX24]
Mark Davis; Ken Whistler. Unicode Script Property. 26 May 2017. Unicode Standard Annex #24. URL: https://www.unicode.org/reports/tr24/tr24-27.html
[UAX29]
Mark Davis; Laurențiu Iancu. Unicode Text Segmentation. 13 June 2017. Unicode Standard Annex #29. URL: https://www.unicode.org/reports/tr29/tr29-31.html
[UAX44]
Mark Davis; Ken Whistler. Unicode Character Database. 25 September 2013. URL: http://www.unicode.org/reports/tr44/
[UAX9]
Mark Davis; Aharon Lanin; Andrew Glass. Unicode Bidirectional Algorithm. 14 May 2017. Unicode Standard Annex #9. URL: https://www.unicode.org/reports/tr9/tr9-37.html
[UNICODE]
The Unicode Standard. URL: https://www.unicode.org/versions/latest/
[UTR50]
Koji Ishii. Unicode Properties for Vertical Text Layout. 31 August 2013. URL: http://www.unicode.org/reports/tr50/

Informative References

[BCP47]
A. Phillips; M. Davis. Tags for Identifying Languages. September 2009. IETF Best Current Practice. URL: https://tools.ietf.org/html/bcp47
[CSS-COUNTER-STYLES-3]
Tab Atkins Jr. CSS Counter Styles Level 3. 14 December 2017. CR. URL: https://www.w3.org/TR/css-counter-styles-3/
[CSS-TEXT-DECOR-3]
Elika Etemad; Koji Ishii. CSS Text Decoration Module Level 3. 3 July 2018. CR. URL: https://www.w3.org/TR/css-text-decor-3/
[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[ISO15924]
Code for the representation of names of scripts. International Organization for Standardization. 1998. ISO 15924:1998. Draft International Standard
[JIS4051]
Formatting rules for Japanese documents (『日本語文書の組版方法』). Japanese Standards Association. 2004. JIS X 4051:2004. In Japanese
[JLREQ]
Yasuhiro Anan; et al. Requirements for Japanese Text Layout. 3 April 2012. NOTE. URL: https://www.w3.org/TR/jlreq/
[JUSTIFY]
Elika Etemad; Richard Ishida. Approaches to Full Justification. URL: https://www.w3.org/International/articles/typography/justification
[TYPOGRAPHY]
Richard Ishida. International text layout and typography index. 28 August 2018. WD. URL: https://www.w3.org/TR/typography/
[XML10]
Tim Bray; et al. Extensible Markup Language (XML) 1.0 (Fifth Edition). 26 November 2008. REC. URL: https://www.w3.org/TR/xml/
[ZHMARK]
标点符号用法 (Punctuation Mark Usage). 1995. 中华人民共和国国家标准

Property Index

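Name: hanging-punctuation
Value: none | [ first || [ force-end | allow-end ] || last ]
Initial: none
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: per grammar
Computed value: specified keyword(s)

Name: hyphens
Value: none | manual | auto
Initial: manual
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: letter-spacing
Value: normal | <length>
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: by computed value type
Canonical order: n/a
Computed value: an absolute length

Name: line-break
Value: auto | loose | normal | strict | anywhere
Initial: auto
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: overflow-wrap
Value: normal | break-word | anywhere
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: tab-size
Value: <number> | <length>
Initial: 8
Applies to: block containers
Inherited: yes
Percentages: n/a
Animation type: by computed value type
Canonical order: n/a
Computed value: the specified number or absolute length

Name: text-align
Value: start | end | left | right | center | justify | match-parent | justify-all
Initial: start
Applies to: block containers
Inherited: yes
Percentages: see individual properties
Animation type: discrete
Canonical order: n/a
Computed value: see individual properties

Name: text-align-all
Value: start | end | left | right | center | justify | match-parent
Initial: start
Applies to: block containers
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: keyword as specified, except for match-parent which computes as defined above

Name: text-align-last
Value: auto | start | end | left | right | center | justify | match-parent
Initial: auto
Applies to: block containers
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: text-indent
Value: [ <length-percentage> ] && hanging? && each-line?
Initial: 0
Applies to: block containers
Inherited: yes
Percentages: refers to block container’s own inline-axis inner size
Animation type: by computed value type
Canonical order: per grammar
Computed value: computed <length-percentage> value, plus any specified keywords

Name: text-justify
Value: auto | none | inter-word | inter-character
Initial: auto
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: text-transform
Value: none | [capitalize | uppercase | lowercase ] || full-width || full-size-kana
Initial: none
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: white-space
Value: normal | pre | nowrap | pre-wrap | break-spaces | pre-line
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: word-break
Value: normal | keep-all | break-all
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword

Name: word-spacing
Value: normal | <length-percentage>
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: refers to width of the affected glyph
Animation type: by computed value type
Canonical order: n/a
Computed value: the keyword normal or a computed <length-percentage> value

Name: word-wrap
Value: normal | break-word | anywhere
Initial: normal
Applies to: inline boxes
Inherited: yes
Percentages: n/a
Animation type: discrete
Canonical order: n/a
Computed value: specified keyword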

Issues Index

Comments on how well this would work in practice would be very much appreciated, particularly from people who work with Thai and similar scripts. Note that browser implementations do not currently follow these rules (although IE does in some cases transform the break).
Add example of hanging white space + same example right-aligned.
Any guidance for appropriate references here would be much appreciated.
The rules here are following guidelines from KLREQ for Korean, which don’t allow the Chinese/Japanese-specific breaks. However, the resulting behavior could use some review and feedback to make sure they are correct, particularly when “word basis” breaking is used (word-break: keep-all) in Korean.
It has been proposed that this property could also apply when the white-space property does not allow wrapping, introducing a break anywhere the line would otherwise overflow, but without causing any change to intrinsic size computations. See https://github.com/w3c/csswg-drafts/issues/1171#issuecomment-295522963
If you find any issues, recommendations to add, or corrections, please send the information to www-style@w3.org with [css-text] in the subject line.
Should block and cluster scripts be merged? They have different tolerances for space-justification vs inter-character justification, but both admit both.
THIS CHANGES LIST IS WAY INCOMPLETE PLEASE SEE Disposition of Comments.