List of CSS features required for paged media

Bert Bos

Last modified: $Date: 2016/10/06 16:42:35 $

Summary

This document lists aspects of layout that are of particular importance for paginated display and which could be handled by a future version of CSS. (For example: footnotes and page references.) Some aspects could also be handled to some extent by pre- or post-processors, or by alternative technologies to CSS. (For example: tables of contents or alphabetic indexes.)

Any known proposed solutions are mentioned. Some proposals have been the subject of considerable effort already, even experimental implementations (e.g., simple running headers); some others are no more than vague ideas (e.g., user-defined page templates).

Paginated rendering can be interactive or not. In the former case, some aspects of the interaction can also be considered part of the style. (For example, when rendered through a PDF reader, elements may be bookmarks in the navigation menu; or in a document browser, the page transition may be animated.)

Status: This is just a list where I note ideas when I hear them. To see the state of their development in CSS or other W3C technologies, follow the links in the text. (If there are no links, it usually means there are no published draft specifications yet.)

Still more requirements can be found in Requirements for Latin Text Layout and Pagination, a Note by the Digital Publishing Interest Group.

Running headers and footers

There are three aspects:

  1. The position on the page
  2. The contents
  3. The formatting

Figure: A running footer with text in two different fonts (left: the whole page; right: an enlargement of the footer). The scan of a magazine page shows a footer reading “CONNAISSANCE DES ARTS FÉVRIER 2012”, the first three words in a thin serif font and the last two in a bold sans-serif font.

Normal cases

Most printed books (how many? 70%? or 99%?) have relatively simple running headers and footers. The predefined page template in css3-page, with its sixteen margin boxes, is enough for their placement, the 'string-set' property and 'string()' functional notation are enough for their contents, and the properties on the margin boxes are enough for their style.

For example, a very common style is to put page numbers in the top outer corners, and the book title and chapter title centered at the top of even and odd pages respectively:

@page :left {
@top-left {content: counter(page)}
@top-center {content: string(book); font-style: italic}
}
@page :right {
@top-right {content: counter(page)}
@top-center {content: string(chapter); font-style: italic}
}
h1 {string-set: book contents}
h2 {string-set: chapter contents}

The trickiest of the “normal” cases is a running header in the top right corner with both a page number and a chapter title, but one of them in bold. The “trick” here is to see the chapter title as a left header that happens to have a large and flexible left margin:

@top-left {content: string(chapter); margin-left: auto}
@top-right {content: counter(page); font-weight: bold}

Special, possibly complex content

Another few cases involve content that is given in the document for the specific purpose of serving as a running header. This allows a running header that isn't easily derived from the text that occurs in the document body (a shorter version, a differently worded version) or does not correspond to any text at all (a summary of the page or section, a title for a section that doesn't have a heading, etc.):

...
<section>
 <aside>In which John goes shopping</aside>
 <p>Meanwhile,...
 ...
</section>
<section>
 <aside>The red balloon</aside>
 <p>When he left the...
...

The ASIDE elements in this example are meant as a kind of optional “headings” to be shown in the running header for easy browsing, but without interrupting the flow of the text for a reader.

The running() and element() notations proposed in css3-gcpm can solve this. The style used is the style of the element, not the style of the margin box (except that the margin box serves as containing block):

@page {
@top-center { content: element(subject) }
}
aside { position: running(subject) }

The precise syntax is, of course, still subject to change. E.g., instead of adding the functionality to the 'position' property, it could also be added to 'float', 'display' or 'string-set', e.g., as 'string-set: subject element'.

Note that moving an element from the normal flow to a running header in this way is different from moving it to another flow, as is done with the 'flow' property in templates/regions. The element isn't added to any flow, it is only labeled with a tag (“subject” in the example) and made invisible in the normal flow. The running header picks at most one of the possibly several elements with that tag in all flows on this page and on previous pages. Which one it picks is determined by an argument of the element() notation.
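
As a minimal sketch of such an argument, assuming that the keywords 'first' and 'last' known from string() are also accepted by element() (the syntax is a proposal and may change), left pages could show the first “subject” on the page and right pages the last:

@page :left {
@top-center { content: element(subject, first) }
}
@page :right {
@top-center { content: element(subject, last) }
}
aside { position: running(subject) }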

(Note that Prince version 8 has a 'flow' property that is similar to 'position: running()' and it does not have a property that corresponds to the flow concept from templates/regions.)

For comparison: XSL-FO has a “retrieve-marker” element, which is similar to string() from css3-gcpm in that it retrieves a copy of the first or last tagged element, but it is similar to element() in that the copy has structure and can be styled.

Complex, but not special, content

In a few books, and often in glossy magazines, the running header is content that is copied from normal elements (i.e., not specially added to the document for the purpose of becoming a running header), but that is complex in nature: it has content that mixes multiple styles. For example:

One could create a copy of the element by transforming the document (with XSLT) prior to rendering and then using the ideas from the previous section to put one of the two copies in the running header.

Css3-gcpm also mentions the possibility of inventing a CSS syntax for making a copy (limited to the purpose of running headers), but there are no proposals yet for how CSS could style the copy and the original in that case. One could imagine a pseudo-class (':original'):

/* Define the running header: */
@page :right {
@top-right { content: element(chapter) " - " element(subject) }
}
/* The ",normal" means a copy is made that remains in the flow: */
h1 { position: running(chapter), normal }
h2 { position: running(subject), normal }
/* Style the copy in the running header: */
h1 { font: medium serif; color: black }
/* Style the copy in the flow: */
h1:original { font: large fantasy; color: red }

Or consider “chapter” and “subject” to be not just labels, but also a special kind of region, in which case region-based styling might be used:

h1 { font: large fantasy; color: red }  /* Applies to both copies */
@region chapter {
h1 { font: medium serif; color: black } /* Override in running header */
}

Running headers in unusual positions

In some glossy magazines it may happen that the running header isn't actually at the top. E.g., there is some text at the top, then a colored band with the page number, and then the text continues. Or the page number is in a box on the side with the text wrapping around it, as if it were a float.

Proposals exist that use page-based grid templates (see below) and then designate some of the regions in that template for running headers. That is easily done with the 'content' property, which already applies to other regions (in particular ::before/::after and margin boxes):

@page {
  grid: "top" 5em
        "running-header" 2em
        "bottom";
  chains: top bottom;
}
@page ::slot(running-header) {
  content: counter(page);
  border: solid;
}

This assumes that page-based templates can be styled, see below.

Another proposal is to allow the designer to create additional margin boxes with arbitrary names and use, e.g., some combination of 'top', 'left', 'bottom', 'right', 'width' and 'height' to position them:

@page :right {
  @slot { content: string(chap); top: 0; left: 0 }
  @slot { content: counter(page); bottom: 0; right: 0 }
}

Figure: Page number integrated into the decoration of a chapter title (left: the whole page; right: an enlargement). The scanned magazine page shows the page number inserted below the chapter title.

Colored patches along the page edge

Some books have colored patches bleeding to the outer edge of the page, at a different height for each chapter, so you can see where each chapter starts and ends even without opening the book. The patches can also be black, with only their position changing with each chapter. Sometimes there is text inside those patches.

The difference with running headers is that the page edge may display more than one patch if there are two or more chapters on that page, while the running header only displays the first (or last) chapter title.

It might be enough to allow an extra keyword in the 'string()' and 'element()' notations described above: not just 'start', 'first' and 'last' (which select exactly one of the marked elements on the page), but also 'all', to concatenate all elements.

E.g., the following puts the patches in the '@right-top' and '@left-top' margin boxes. Each H2 defines a black patch with a section number and each H2 puts it at a different height:

@page :right {
  @right-top {content: element(patch, all); position: relative}
}
@page :left {
  @left-top {content: element(patch, all); position: relative}
}
...
h2::after {content: counter(h2); background: black; color: white;
  position: absolute; left: 0; width: 100%}
h2:nth-of-type(1) {top: 0em}
h2:nth-of-type(2) {top: 1em}
h2:nth-of-type(3) {top: 2em}
...

Footnotes, end notes and marginal notes

There are four aspects:

  1. Where does the footnote text come from
  2. Inserting the footnote numbers or markers
  3. Incrementing the footnote numbers
  4. Positioning the footnotes

Footnotes from inline elements

A footnote can be encoded as an element right after the word it is a footnote to:

<P>This is a word. <SPAN CLASS=note>See the
discussion in the next chapter.</span>

To take the note out of the flow, the usual choices are the 'display', 'float' and 'position' properties, although we might want to reserve 'display' for making footnotes block or inline. css3-gcpm proposes to use 'float: footnote'.

span.note {float: footnote}

In simple cases (continuously incrementing decimal numbers, footnote at the bottom of the page), this single declaration is all that is required.

The above mark-up does not clearly express what the SPAN is an annotation to. A link (see below) is more explicit, or the mark-up might be like this:

<P>This is a <RUBY><RB>word.</RB> <RT>See the
discussion in the next chapter.</RT></RUBY>

But for the CSS that probably makes no difference:

rt {float: footnote}

Although 'float: footnote' is in some ways similar to other types of float (especially to 'float: bottom', see below), it also has differences: the floated element does not necessarily turn into a block, it leaves behind a '::footnote-call' and it gets a '::footnote-marker' (or a '::marker'?).

Footnotes from an attribute

Footnote text may be in an attribute, such as the TITLE attribute in HTML. Probably nothing special is needed for this case. If you can make an element into a footnote, you can also make a '::after' pseudo-element into a footnote:

span[title]::after {content: attr(title); float: footnote}

Footnotes from linked elements

A footnote is a kind of link, so it makes sense to model it in HTML as a hyperlink. The mark-up of a document might be like this:

<P>The trees are <a href="#color">blue</a>
in the summer.
...
<P ID=color>The colors (blue, red and orange)
are caused by different kinds of insects.

css3-gcpm proposes a 'target-pull()' functional notation:

A::after {content: target-pull(attr(href url)); float: footnote}

Obviously, an element cannot “pull” itself or an ancestor. Certain combinations of properties may also not work.

Footnote text from another document

The text of the footnote may be another document:

<P>The fruit is a kind of <a href="apple">apple</a>

Should the target-pull() work here?

The footnote text might also be a fragment of another document:

<P>This is an <a href="fruit#apple">apple</a>

Multiple footnote areas

In a document with multiple columns, each column might have its own footnote area. In that case, each footnote area can be as high as it needs to be,

Text in the first | Text in the       | The third column
column(1) with a  | second column,    | has(1) two(2)
single footnote   | which has no      | footnotes.
below it.         | footnotes and     | -----
-----             | thus needs no     | 1) This is the
1) This is the    | footnote area     | first.
footnote          | below it.         | 2) The second.

or they can all be forced to be as tall as the tallest of them (which may require multiple passes to balance the columns).

If the document, or part of it, is laid out as several independent flows in different regions (by means of a grid-based template or other), each of those flows can have its own footnote area. And if each flow occupies a chain of connected regions, there might be either a footnote area in each region (as if each region is a mini-page), or a single footnote area for the whole flow.

Multiple kinds of footnotes

In academic publications, such as critical editions, there may be two or more different kinds of footnotes, e.g., bibliographic references numbered with decimal numbers, alternative spellings numbered with letters, and explanations numbered with roman numerals.

Numbering different kinds of footnotes with different counters is not so difficult. css3-gcpm proposes that any counter can be used and that the style can be set with pseudo-elements '::footnote-call' and '::footnote-marker'. The UA style sheet already contains one such counter, footnote, and sets the style to super-decimal. E.g., to use two different kinds of footnotes:

.note { float: footnote; counter-increment: note }
.note::footnote-call { content: counter(note, lower-latin) }
.note::footnote-marker { content: counter(note, lower-latin) }
.bib { float: footnote; counter-increment: bib }
.bib::footnote-call { content: "[" counter(bib, decimal) "]" }
.bib::footnote-marker { content: counter(bib, decimal) }

However, these footnotes are all interleaved in the same footnote area. Positioning them in separate areas might require a user-defined page template, because there is only one predefined @footnote area. E.g.:

@page {
grid: "*"
"notes"
"refs"
}
.note { float: notes }
.bib { float: refs }

The 'flow' property from grids/regions as it is defined there is probably not usable, because it doesn't ensure that the footnote is on the same page as the text it belongs to, or on a later one. Also, 'flow' doesn't generate '::footnote-call' and '::footnote-marker' pseudo-elements. (Maybe it is possible to define that all floats generate '::footnote-call' and '::footnote-marker', which just happen to be empty by default. But that seems difficult: we'd like them to not be empty for 'float: footnote'.)

Maybe it is possible to use 'flow' anyway, and have some other property that says that two flows are synchronized:

@page {
grid: "*"
"notes"
"refs";
synchronized: * notes refs
}
.note { flow: notes }
.bib { flow: refs }
.note::footnote-call { content: counter(note, lower-latin) }
.note::footnote-marker { content: counter(note, lower-latin) }

Footnote rules

In many (older) books, the footnotes are separated from the main text by a short rule. There are two challenges:

  1. the rule is only a few em wide, and
  2. it is only there if there are footnotes.

Css3-gcpm says the footnote area isn't drawn if it is empty (no footnotes floated into it), and thus you can add a footnote rule as a simple border:

@page {
@footnote {
margin-top: 0.5em;
border-top: thin solid;
padding-top: 0.5em } }

If there are multiple kinds of footnotes and/or they are positioned with regions (see above), then probably those regions aren't omitted automatically and some other rules are needed to remove the footnote rule. Css3-layout suggests that empty slots match the '::blank()' pseudo-element, and thus you can “unstyle” an empty footnote area:

@page {
grid: "*"
"notes"
"refs" }
@page ::slot(notes) {
margin-top: 0.5em;
border-top: thin solid;
padding-top: 0.5em }
@page ::blank(notes) {
margin-top: 0;
border-top: none;
padding-top: 0 }

(Css3-page has a ':blank' pseudo-class to select empty pages in order to style them, but a '::slot()' is a pseudo-element and you cannot apply a pseudo-class to a pseudo-element…)

To make the rule short, css4-background suggests a 'border-clip' property to cut the border into visible and invisible parts. In this case there is just one part:

@page {
@footnote {
margin-top: 0.5em;
border-top: thin solid;
padding-top: 0.5em;
border-clip: 4em } }

Inline and block footnotes

If the footnotes are short, you may want them rendered as inlines, otherwise as blocks. The use of 'float: footnote' suggests that the floated element is turned into a block, because that is what happens for 'float: left', 'float: top', etc., but footnotes are different. Css3-gcpm suggests a simple switch: 'display' on the @footnote area makes all footnotes into inlines:

@footnote { display: inline }

or into blocks:

@footnote { display: block }

I.e., the 'display' doesn't apply to the @footnote itself, but to all elements floated into it. (This is different from, e.g., the 'border' or 'columns' properties, which can also be set on @footnote, but work as usual.)

If you have a mixture of short and long footnotes, you might want to make the long ones into a block and combine several short ones on one line, maybe with 2 or 3 em of space between them.

You could let the UA decide what is long and what is short, and say something like

@footnote { display: mixed; footnote-space: 3em }

Or the author could mark up the two kinds, e.g., with a CLASS in HTML:

.note { float: footnote; display: inline; margin-right: 3em }
.note.long { display: block; margin-right: 0 }
@footnote { text-indent: 1em }

But the 'margin-right' should somehow be suppressed after the last short note…

Maximum size of footnote area

An author might want to set a maximum size on the footnote area (as a length or a percentage), to avoid a page that consists of only footnotes or has just one or two non-footnote lines.

In css3-gcpm, 'max-height' applies to the @footnote area:

@page { @footnote { max-height: 10cm } }

If the footnotes are positioned with a grid template, the minmax() notation can be used:

@page { grid: "*" "notes" minmax(0, 10cm) }

Numbering notes per page

It is common to number footnotes from 1 on every page (while end notes are numbered throughout a document or a chapter). If symbols *, †, ‡, etc. are used, then the first footnote on every page gets a *. Is it possible to use 'counter-reset' in an @page-rule?

@page { counter-reset: footnote }
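
If counters can indeed be reset per page this way, the symbol sequence could be sketched as follows. This assumes the symbols() notation from css-counter-styles-3 (whose default system repeats the symbols, giving **, ††, etc. after the list is exhausted) and the '::footnote-call' and '::footnote-marker' pseudo-elements from css3-gcpm; the 'span.note' selector is only an example:

@page { counter-reset: footnote }
span.note { float: footnote }
span.note::footnote-call { content: counter(footnote, symbols("*" "†" "‡")) }
span.note::footnote-marker { content: counter(footnote, symbols("*" "†" "‡")) }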

There are several aspects that make numbering footnotes hard:

Currently there are no proposed properties for CSS that make such collapsing of footnotes possible, nor a way to restart numbering in each column.

White space issues

The footnote call should be placed after the word it applies to, but in the source document there is probably white space between the word and its annotation. The word might not even be marked-up as a separate element. Css3-gcpm calls this “footnote magic:” the footnote call is placed right after the previous text or replaced elements in the same block, ignoring any inline white space. (Something specific should happen if there is no such text or element.)

Css-text-4 proposes a 'text-space-trim' property to suppress unwanted white space just before or after tags. In this case the rule would be 'text-space-trim: discard-before'.
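
A sketch of how that could look for the inline footnotes used earlier. That the property is set on the note element itself is an assumption made for this example:

span.note { float: footnote; text-space-trim: discard-before }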

Marginal notes

Marginal notes are positioned as floats, maybe they are floats, but they also leave behind footnote calls in the main text. Probably easiest is thus to use another keyword on float:

span.note {float: left-note}
span.note {float: outside-note}

'Left-note' is like 'left', but in addition gives the element a '::footnote-call' pseudo-element. Similarly, 'outside-note' is like 'outside', but also creates a '::footnote-call'.

Page templates

When the built-in page template of CSS (16 margin boxes around the page body) is not enough, we'll need user-defined page templates. E.g., for complex, multi-level running headers and for page bodies with a grid-based template. The latter may occur, e.g., in photo books, and in most non-scientific magazines.

It seems reasonable to re-use the grid templates for that, but attached to a page rather than an element. The section on “Running headers in unusual positions” already showed how they might be used for running headers and “Multiple kinds of footnotes” showed an example for multiple footnote areas.

The same grid templates can also be used for the page body. The difference between the different usages is only in how the areas are filled with content:

Here is an example with all three kinds of regions: a and b are meant for running headers and are filled with, respectively, the first English H1 and the first French H1 on the page. c and d are for body content and are filled with the 'flow' property. e and f are for footnotes and are filled with the 'float' property.

@page {
grid: "a a a"  1.2em
      "b b b"  1.2em
      ". . ."  1em
      "c . d"
      "e . f"  minmax(0, 10em)
}
::slot(a) {content: string(english-title); font-weight: bold}
::slot(b) {content: string(french-title); font-style: italic}
h1:lang(en) {string-set: english-title contents}
h1:lang(fr) {string-set: french-title contents}
[lang=en] {flow: c}
[lang=fr] {flow: d}
.note:lang(en) {float: e}
.note:lang(fr) {float: f}

The use of 'string()' and 'string-set' allows the running headers to be in sync with the main content and repeated. The use of 'float' allows the footnotes to be in sync with the main content without being repeated. And the use of 'flow' means that the contents of c and d are not synchronized in any way: one of the two flows may be much longer than the other, so that the final pages have an empty slot for the shorter flow.

Named page templates

A complex book or magazine typically has not one, but several page templates. A certain kind of content may require a certain kind of page layout: the short-items section has a 4-column based layout, the essays section has one column and a wide margin, the photos section has a 2×2 layout, etc.

Css3-page allows names after @page and defines a 'page' property, which, when set on an element, forces that element to start a new page of the given name, unless it already is on such a page.
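
A minimal sketch of how that looks, with hypothetical page names, class names and margin sizes chosen only for this example:

@page short-items { margin: 2cm }
@page essays { margin: 2cm 2cm 2cm 6cm }
div.short-items { page: short-items; columns: 4 }
div.essays { page: essays }

Each DIV starts a new page of the named type unless it already falls on one.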

Sequences of page templates

Often, page templates occur in a certain order. The most common example is: the first page of a chapter is followed by a left page, then a right page, then a left page again, etc. This particular case can be handled with named pages, the 'page' property and the ':left' and ':right' pages from css3-page.

But other sequences cannot. E.g., an H1.news may trigger a news page template, which must then be re-used until another H1 triggers a different page template.

This could be handled by specifying for each page template what the default template for the next page is:

@page news {
... /* style for news pages */
next: news }
@page photos {
... /* style for photo pages */
next: photos }

H1.news {page: news}
H1.photos {page: photos}

Empty pages

When a page is empty of content (e.g., if the last page of a chapter is on a right page and the next chapter only starts on the next right page) you may want to give it different running headers and footers, or add the text “This page intentionally left blank” to it.

Css3-page proposes the page selector ':blank' for that. Page selectors can be combined, so you can also distinguish left blank pages ('@page :left:blank') from right ones ('@page :right:blank').
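
For example, a sketch that suppresses the running headers on blank pages and prints the customary notice instead (the choice of margin boxes is arbitrary):

@page :blank {
@top-left { content: none }
@top-right { content: none }
@top-center { content: "This page intentionally left blank" }
}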

Cross-references

Page references

Css3-gcpm proposes a notation for cross-references, such as “See page 7,” where the page number can only be known at rendering time:

a::after {content: " (page\A0" target-counter(attr(href url), page) ")"}

to be used with mark-up such as:

... text <a href="#t1">anchor</a>...

... <h3 id=t1>Heading</h3>

A note on the syntax: The nested attr() is probably redundant, it could just be 'target-counter(attr, page)'.

When referring to a page, the page number should be shown the same way it appears on the page. Thus page XVII should be referred to as page XVII and not as page 17. Can this be automatic? What if the referenced page has no running headers and footers and thus doesn't show its page number?
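
Until that is automatic, the style can be given by hand. A sketch, assuming the optional third argument of target-counter() (a counter style) and a hypothetical class 'front' on links that point into front matter numbered with roman numerals:

a.front::after {
content: " (page\A0" target-counter(attr(href url), page, upper-roman) ")" }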

Page numbers can consist of two counters, e.g., pages may be numbered as “6-21” for chapter 6 page 21. In that case the generated content should refer to two counters:

content: " (page\A0" target-counter(href, chapno)
  "-" target-counter(href, page) ")"

Fancy page references

The indication “see X on page 7” may not be the right one if it occurs on page 7 itself. One would rather see “on this page” or “below” in that case.

Css3-gcpm proposes a pseudo-class that matches if an element is on the same page as the element it links to, and two variants for if it is on the previous or next page:

a:target-layout(attr(href url) same-page)::after {
  content: " on this page" }

(The selectors4 draft proposes a notation to select the element pointed to by another, by means of a URL or an IDREF. Another proposal in selectors4 allows reversing the selection and thus selecting the element that points to another. The target of the A element above is selected with 'a/href/ *'. And an A element that points to something is selected with 'A!/href/ *'. Maybe building on this notation can make the same-page pseudo-class easier. E.g., 'a!/href/ *:same-page', 'a!/href/ *:previous-page' and 'a!/href/ *:next-page'.)

A similar case occurs when referring to inside/outside floats. Depending on whether the page is a left or a right page, the reference to the float may be “see on the left” or “see on the right.” This could maybe also be a pseudo-class.

On some media, the element might not float at all, e.g., because the viewport is too narrow or because the output is speech. In that case “on the left” might become “above” or “earlier.” See also Alternative content.

References to the content of elements

Css3-gcpm also proposes a way to copy the content of the target element. E.g., for mark-up like this:

<p>See <a href="#chx">this chapter</a> for an in-depth evaluation.
...
<h2 id="chx">A better way</h2>

one could use a style sheet like this:

a { content: '“' target-text(attr(href url), content-element) '”' }

This is most useful when referring to generated content, such as the content of '::before' (with the 'content-before' keyword), because the author typically knows the content of the element already, but not the generated content, which could be the value of a counter.

However, any structure in the target element is lost, because only the text is copied. That could be a problem if the target element contained mark-up for bidi or a bit of math. Maybe a 'target-element()' notation combined with region-based styling can help:

a { content: "“" target-element(attr(href url), content-element) "”" }
@region a {
 h1 { font: inherit }
}

Assuming the href points to an H1 element, this causes a copy of that H1 element to be put in the A element, replacing its content. The A element is then treated as a region and the last style rule sets the font of H1 elements that are pulled into A elements.

Continuation marks

When a text is broken into several regions, it may be necessary to insert something at the end of a region to indicate that the text continues elsewhere, possibly with the page number where the text continues: “Continued on page 7.”

If the text continues elsewhere but on the same page, you may want to suppress that marker, or replace the “page 7” with some other phrase. (See also “Fancy page references” above.)

There is a proposed 'text-overflow' property in css3-ui that can insert a fixed text, but it cannot currently insert a page number.

Referring to the end value of a counter

Text such as “page 7 of 20” refers to the counter “page” twice, once for a particular occurrence and once for the last occurrence. Css3-page defines a special counter 'pages' for the latter, but user-defined counters do not have such a companion. In other words, to refer to “item 7 of 20” you need some hacks…

There could be a special function that refers to the highest value of a counter.
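
For comparison, the built-in 'page' and 'pages' counters can already be combined. A sketch, with an arbitrarily chosen margin box:

@page {
@bottom-center { content: "page " counter(page) " of " counter(pages) }
}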

Ordinal numbers

Instead of referring to “item 7” or “page III,” you might want to refer to “the 7th item” or “the 3rd page”:

content: "the " target-counter(href, ordinal) " item"

But this requires quite a bit of linguistic knowledge in the renderer. It needs to know which language to generate. Depending on the language, the generated form may also depend on the context, e.g., a masculine or feminine form.

Equation numbers

The term “equation number” indicates their primary use: to label mathematical formulas. But they are also used to label other displayed material, such as grammar rules, example phrases in linguistics or chemical formulas. They may differ in how they are aligned, but in terms of referencing them, they are not much different from section numbers, page numbers, or list numbers: you should be able to refer to them the same way, with something like “see equation 17.2 above.”

Aligning to the bottom of a column

In magazines with two or more columns of text, the designer often tries to arrange things so that all columns end exactly at the bottom. That may require re-cropping an image (which CSS cannot easily do), but also inserting some flexible space above the last paragraph.

There seem to be a few different cases:

Page floats

The concept of “floating” in CSS is used for moving content such that it is no longer inline, but still as near as possible to where it originated. The model for footnotes in css3-gcpm therefore uses the 'float' property. But in paged media, there are other distinguished locations that content can be floated to.

Inserts at the top or bottom of a page or column

Floats can go to the top or bottom of this page, if there is room, otherwise to the top or bottom of the next available page, see css3-gcpm.

figure { float: top }

If the element occurs inside a column or inside a region (a slot of a grid template), the element can either go to the top of that column or region or to the top of the page. (If it is a column inside a region, it might go to three different places: the top of the column, of the region or the page.) There is no proposed syntax yet to express this choice.

If there are multiple bottom floats on a page, they should probably be stacked with the oldest above, i.e., with the visual order corresponding to the order in the document source. (This is different from what happens with right floats). But this needs investigation.

The choice of top or bottom might also be made at rendering time based on which has room, which is nearer, or which preserves the order of floats:

figure { float: snap }

The choice might also be between keeping the element in the flow and floating it to the top or bottom:

figure { float: here }

This means the element is rendered as a block right where it falls in the flow, unless that would cause a page break before it, because there is not enough space left on the page. In that case it automatically turns into a top float. (Typically, the element would also have a 'break-inside: avoid'.)

Float to inside or outside margins

If there are left and right pages, an element can float to the inside or outside margin, i.e., left or right depending on the page it falls on. See css3-gcpm.

figure { float: outside }

Floats in the middle of the text

Floats could also be in the middle (of a column, of a page, or at a specific offset from the side), with text flowing on both sides. This is in general not good for readability (it's difficult to find the correct continuation of a line on the right side of an image), but it has been used for special effects, e.g., in children's books (of the kind that an adult reads out loud while the child watches the images). On the other hand, there is no readability problem if the float is in the middle between two columns, intruding on both of them for half of its width.

With a grid-based template, you can reserve space at an exact vertical position (assuming L-shaped regions), such as the exact center of a page:

body { grid: "* * *" *
             "* x *" 5em
             "* * *" * }
img#p1 { flow: x;
         width: 100%;
         height: 100%;
         object-fit: cover }

With a property like 'float-offset' from css3-gcpm, you can instead position a float near the text that refers to it (subject to other floats that might already occupy the space):

img#p1 {
  float: left;
  float-offset: 50%; /* = center, as in background-position */
  clear-side: none;  /* = wrap-flow: both */
}

The name 'clear-side' is proposed by css3-gcpm to define which side of a float does not get text. The name 'wrap-flow' is proposed by css3-exclusions to define which side of a float does get text. The set of values could be:

Value for 'clear-side'  Value for 'wrap-flow'  Meaning
auto                    auto                   Depends on type of float (level-2 behavior)
left                    right                  Only wrap text around the right side
right                   left                   Only wrap text around the left side
top                     bottom                 In vertical text, only wrap text around the top side; otherwise same as auto.
bottom                  top                    In vertical text, only wrap text around the bottom side; otherwise same as auto.
start                   end                    Left or right, depending on 'direction' of containing block
end                     start                  Left or right, depending on 'direction' of containing block
none                    both                   Text flows on both sides
both                    none                   No text on either side
minimum                 maximum                Text on the side with the most space (not yet well defined)
maximum                 minimum                Text on the side with the least space (not yet well defined)

Page spreads

A left and a right page treated almost as a single double-width page. This can be used, e.g., in a magazine to have an extra-large title above the article, or in a book to show a particularly large table or illustration.

An extra difficulty is that we want headings that span the whole spread to appear as if they are a single line, but in reality the words are positioned so that the gutter falls exactly between two words. If it is a table that is spread over the two pages, we would similarly want the break to be between columns and not in the middle of one.

Generated content

CSS level 2 has small-scale generated content: 'before' and 'after' pseudo-elements can have a 'content' property. Level 3 adds the content property to the margin boxes of the predefined page template and also proposes to use the same property to replace an element's content by generated content and to create generated content in regions (slots) of a grid-based template.

All those uses of the 'content' property generate flat text in a single style, or use elements that are moved from elsewhere (with 'position: running()' from css3-gcpm for running headers and 'flow' from css3-layout/css3-regions for text flows). They do not generate structured text, such as tables of contents, bibliographies or alphabetical indexes.

Tables of contents, figures, tables, etc.

Various kinds of tables can be generated with a pre-processor (such as XSLT), leaving just the references to counters (section numbers, page numbers) to be generated at rendering time.

Alphabetical index

A basic alphabetic index can be made by a preprocessor (except that, if page numbers are needed, those have to be added at rendering time).

But it is tradition that multiple occurrences on the same page are collapsed, and occurrences on a sequence of pages are replaced by a range. I.e., instead of

Oak, 7, 7, 10, 21, 22, 22, 23

the index should read

Oak, 7, 10, 21–23

Some authors may choose to coalesce a range of pages subject to how the term is used: two independent mentions in passing (no coalescing), or two mentions that are part of the same story (coalescing). This is obviously something a human should indicate by using different kinds of mark-up for the different occurrences.

Collapsing sequences to ranges can only be done after the page numbers are known, i.e., by the formatter, and not by a pre-processor.
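The collapsing step itself is simple once the page numbers are available. The following Python sketch shows one possible way to do it; the function name is invented here, and the choice to turn even two consecutive pages into a range is an assumption (some style guides may differ).

```python
def collapse_pages(pages):
    """Collapse page numbers into the text of an index entry (sketch).

    Duplicate pages are merged and runs of consecutive pages are
    replaced by a range written with an en dash.
    """
    runs = []  # list of [first, last] pairs
    for page in sorted(set(pages)):
        if runs and page == runs[-1][1] + 1:
            runs[-1][1] = page          # extend the current run
        else:
            runs.append([page, page])   # start a new run
    parts = []
    for first, last in runs:
        parts.append(str(first) if first == last else f"{first}\u2013{last}")
    return ", ".join(parts)


print("Oak, " + collapse_pages([7, 7, 10, 21, 22, 22, 23]))
# prints: Oak, 7, 10, 21–23
```

Author-controlled coalescing, as described above, would need extra input (for example, a flag per occurrence) that this sketch does not model.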

The index is normally at the end, so optimizing the index in this way is unlikely to cause page numbers to change, but in theory this process could require two or more iterations, until the page numbers stabilize or it is clear that they never will.

There are currently no proposals for how to generate an index with CSS only. It would probably involve a property to mark elements for the index (similar to how 'string-set' marks elements for running headers), maybe a keyword 'index' for the 'content' property to insert an index at the appropriate place, and some properties and pseudo-elements to specify what the index looks like (::term, ::subterm, separator, whether to collapse page numbers, etc.)

Bibliography

Like tables of contents, bibliographies can be generated by a preprocessor. (If the citations are in footnotes, rather than at the end of the document, the footnote markers would still be generated at rendering time.)

Replacing glyphs

In some cases it may be useful to change the appearance of text not just by changing the font, but by substituting different letters altogether. E.g., it may be required by a style guide on a certain platform that the ellipses are rendered as three dots instead of a Unicode ellipsis character. Or that quote marks are straight (") instead of curved pairs (“”). This requires a way to select text and replace it.

Options include a new text selector (a new kind of pseudo-element) combined with the already existing 'content' property, or a property that specifies a search and replace operation:

p:text("…"), p *:text("…") { content: ". . ." }  /* pseudo-element */
p { text-replace: "…" / ". . ." }                /* property */

There may, of course, be multiple replacements that apply to a single element and the replacements are typically inherited by child elements.

The special case of quote marks can also be handled with the 'quotes' property, but only if the source document doesn't already have quotes of its own.

Page breaks and line breaks

When HTML and CSS are used for user interfaces, speed is important. But CSS is designed for typography, which is a constraint-based optimization problem over a discrete search space. In other words, it sometimes takes multiple iterations. We will need a way to limit the time spent on optimization, maybe as a single property, maybe as several properties for the different factors that may influence the speed: line breaking (homogeneous white space throughout a paragraph, few hyphenated lines, no rivers of white, alternative content…)

E.g., Liam suggested a property to select among several line breaking algorithms.

For alternative content, see below.

Hyphenation

Css3-text has properties for hyphenation and for line breaking control in scripts without white space. However, it does not have properties for controlling the desired quality, e.g., how hard to try to avoid a hyphen, avoid hyphens on consecutive lines or avoid a hyphen on the last line of a page (or only on the last line of a recto page).

Also, under the draft of css3-text, the hyphenation dictionary and rules are supposed to be provided by the renderer. There is no way for an author or designer to substitute a different dictionary, provide a fallback dictionary if the renderer doesn't have any, or even provide overrides for certain words (other than by adding soft hyphens directly in the document).

For an author to be able to provide such a dictionary, there would first have to be a standard format for it, which currently doesn't seem to exist. (The formats used by various programs, such as TeX or LibreOffice, might be starting points.)

Note that such a dictionary would be more than just a list of (partial) words or syllables: some hyphenation rules depend on context, e.g., the English words “record” and “present” break differently based on whether they are used as noun (rec-ord, pres-ent) or as verb (re-cord, pre-sent).
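To illustrate what “more than a list of words” could mean, here is a small Python sketch of a lookup keyed on both the word and its grammatical role. The data structure, the function name and the part-of-speech labels are all hypothetical; no such dictionary format has been proposed.

```python
# Hypothetical context-sensitive entries: the key is (word, part of speech),
# the value shows the allowed break points with hyphens.
HYPHENATION = {
    ("record", "noun"): "rec-ord",
    ("record", "verb"): "re-cord",
    ("present", "noun"): "pres-ent",
    ("present", "verb"): "pre-sent",
}


def hyphenation_points(word, part_of_speech):
    """Return the character offsets at which the word may be broken.

    An empty list means the dictionary has no entry for this context.
    """
    pattern = HYPHENATION.get((word.lower(), part_of_speech))
    if pattern is None:
        return []
    offsets, position = [], 0
    for ch in pattern:
        if ch == "-":
            offsets.append(position)  # a break is allowed before this offset
        else:
            position += 1
    return offsets
```

The hard part, which the sketch leaves out, is determining the part of speech in the first place; that would require either mark-up in the source or linguistic analysis by the formatter.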

Thematic breaks (vertical margins) falling at page breaks

[This use case is derived from a case described by Dave Cramer in “Pagination.”]

Sections of text may be separated from each other by nothing more than extra white space, i.e., without a section title. Such a separation is sometimes called a “thematic break.” In CSS, that is typically done with an extra large margin.

But if a CSS margin falls at a page break, it is removed. That is because you typically want the normal margins between paragraphs to disappear and the text to align to the bottom and top of the page. But it means it is no longer visible that there was a thematic break there.

You can use padding instead of margin. That won't disappear. But then you have pages of uneven length, which doesn't look nice, and it is still not very clear that there is a break between sections.

If there are many such thematic breaks and thus a big risk that some of them fall at a page break, the designer may choose to replace the white space by some ornament, such as three centered asterisks or a little flower.

Or the designer may choose to stay with white space and use a visible separator only for those breaks that fall at a page break. Then the problem becomes how to express that rule in the style sheet.

This may be a variant of alternative content, although the reason to choose the ornament over the space isn't purely copyfitting.

The section, or the first paragraph of the section, would have a normal style of, say, 'section {margin-top: 2.4em}' and an alternative style with generated text: 'section::before {display: block; margin: 0 auto 1.2em auto; content: "* * *"}' and then some mechanism, maybe a pseudo-class, to select between them:

section:not(:first-on-page) {margin-top: 2.4em}
section:first-on-page::before {display: block;
  margin: 0 auto 1.2em auto; content: "* * *"}

Or maybe this is more flexible than is needed. If the choice is only between putting an ornament or not, without changing anything else about the element, and only very limited control over the style of the ornament itself, then a simple property or two may be enough:

section {break-ornament: "* * *"; break-ornament-align: center}

This puts the given text inside the top margin, at the very top, and also has the effect that that top margin does not disappear at a page break. (The top margin better be big enough for the ornament, or it risks overlapping the text of the section.)

Possibly the ornament could have its own fonts and color as well.

Page length of facing pages

Normally we want all pages to be the same length (apart from the short page at the end of a chapter), and on double-sided printed pages we even want the same number of lines and the lines on both sides to line up, so that the space between the lines is as white as possible, with no lines from the other side of the paper shining through.

But if that leads to bad page breaks, such as widows and orphans, we may need to make a page shorter or longer anyway. In that case we may decide that it is OK if the page is shorter than the page printed on the back of the paper, but we still want every two facing pages to have the same length.

Clearly, these goals may conflict: avoiding an orphan on a left page by making that page and the next one line shorter may cause an orphan to appear on that next page. If there is no other flexibility in those pages or in previous pages, then the right page probably has to be made one line longer again…

Leaders and tabs

Leaders

CSS has simple leaders, enough to push an element to the end of the line (or the end of the next line, if it doesn't fit) and to fill the space with dots, spaces, or any other string. But that is not enough to make two or more columns at the end of the line, e.g., the currency and the amount in the following example. (Note that it cannot be done with a table.)

Coffee                    USD    2.00
Tea                       USD    1.75
Train                     EUR   67.50
Hotel (including Berlin
and Paris)                EUR  450.00

Align on decimal points

Simple leaders also do not allow aligning on a decimal point, instead of on the right edge of an element.

An old proposal in css3-tables for tab-like alignments was abandoned in favor of the simple leaders, but could be revived, if we think CSS should have this feature. Here are some examples:

signature { tab: 100% right }      /* right-aligned at end of line */
amount { tab: -1em "." }           /* aligned on dot, 1em from end of line */
desc { tab: 0 left }               /* left-aligned to start of line */
pageno { tab: 100% right / " . " } /* right-aligned, with dot leaders */
col2 { tab: 50% center / " · " }   /* centered in the line */

With a document fragment like the following:

<expenses>
<desc>Hotel</desc> <amount>374.55</amount>
<desc>Travel</desc> <amount>1460.10</amount>
<desc>Miscellaneous, including presents and
tips</desc> <amount>84</amount>
<desc>Total</desc> <amount>1918.65</amount>
<signature>Ph. Fogg</signature>
</expenses>

The rendering might be as follows:

Hotel . . . . . . . . . . . . 374.55
Travel  . . . . . . . . . .  1460.10
Miscellaneous, including presents
and tips  . . . . . . . . . .  84
Total . . . . . . . . . . .  1918.65
                            Ph. Fogg
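To make the intended behavior more concrete, the following Python sketch lays out one such line in a monospaced font. It is not an implementation of the 'tab' proposal: the function name is invented, the line width of 36 characters and the two fraction digits are assumptions chosen to match the sample rendering, and line wrapping of long descriptions is not handled.

```python
def leader_line(label, amount, width=36, frac_width=2):
    """Format one line with dot leaders and a decimal-aligned amount (sketch).

    The decimal point (real or implied) is placed frac_width characters
    before the end of the line. Leader dots are put on even character
    positions so that they line up from one line to the next.
    Assumes the label is short enough to leave room for the amount.
    """
    int_part = amount.partition(".")[0]
    dot_index = width - frac_width - 1      # column of the decimal point
    start = dot_index - len(int_part)       # column where the amount starts
    line = list(label.ljust(width))
    # Leave one space after the label and one before the amount.
    for i in range(len(label) + 1, start - 1):
        if i % 2 == 0:
            line[i] = "."
    line[start:start + len(amount)] = amount
    return "".join(line).rstrip()


for label, amount in [("Hotel", "374.55"), ("Travel", "1460.10"),
                      ("and tips", "84"), ("Total", "1918.65")]:
    print(leader_line(label, amount))
```

Note how the amount “84”, which has no decimal point, is still aligned as if the point followed the last digit.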

Align on decimal points in table columns

CSS level 2 used to have a way to align cell contents in a column on decimal points (or any other string of characters), by means of the 'text-align' property: 'td {text-align: "."}' It is useful to have the alignment both for tables and for tabs, because of the different behavior between tables and tabs with respect to line wrapping (see above).

The next version of css3-text is likely to have an updated version of 'text-align' for table cells.

Note that setting the alignment in table cells may influence the size of the cell: two equally long numbers, one with the decimal point at the start and one with the decimal point at the end, will cause the column to become almost twice as wide as when both numbers are centered or aligned to one side. This can be an advantage over aligning with tabs (the position of the decimal point is chosen automatically) but also a disadvantage (if there is no room for a sufficiently wide column, some cells will not align).
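The effect on the column width can be checked with a small calculation. The Python sketch below assumes a monospaced font and measures widths in characters; the function name is made up for this illustration.

```python
def decimal_aligned_width(numbers):
    """Width in characters of a column whose cells are aligned on '.' (sketch).

    The column must be wide enough for the longest integer part, the
    decimal point, and the longest fractional part.
    """
    left = max(len(n.partition(".")[0]) for n in numbers)
    right = max(len(n.partition(".")[2]) for n in numbers)
    has_point = any("." in n for n in numbers)
    return left + right + (1 if has_point else 0)


# Two numbers of six characters each need a column of eleven characters.
print(decimal_aligned_width([".12345", "12345."]))  # prints: 11
```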

Printing marks

See marks & bleed in css3-gcpm.

Bookmarks for PDF

A kind of metadata that is not displayed on the canvas, but for which the CSS syntax is convenient. Css3-gcpm has a proposed syntax:

h1 {bookmark-level: 1}

An additional property can change the text of the bookmark. By default, it is the text of the element.

As with other uses of the text (string(), target-text()), the structure of the element is lost. Elements with bidi mark-up or mathematical formulas may not look correct in the bookmarks menu.

Styling blank pages

See css3-gcpm.

Page sequence direction in interactive media

An interactive paged display may simulate pages that are stacked (as in a book), or side by side (slide sideways) or one above the other (slide up). The designer may want to express a preference.

Css3-gcpm proposes 'overflow-style: paged-x' for slide sideways and 'paged-y' for slide up.

We may even want combinations: the next chapter is below the current one, the next page is on the right, and the next book is stacked below this one…

The designer may even want to specify an animation (transition effect) when changing pages.

Float margins

Floats have margins, but if you float something to the 'inside' or to the bottom-or-top of a page, you may not know which margin to set.

There could be a pseudo-class (img:left {margin-right: 1em}) or just a property for the margin, because the margin is what you typically want to change:

img {
float: inside;
margin-inside: 0;
margin-outside: 1em }

This margin is in addition to the margin-left and margin-right.

Page size

The size of the viewport is chosen by the user, not the designer, but in the case of printed material, they are often the same person, and so it is convenient to choose the paper size directly in CSS. Css3-page offers the 'size' property:

@page {size: a4}

Styling lines

Line selectors

The '::first-line' pseudo-element from level 2 makes it possible to style the first line of a paragraph, but in some magazines it isn't just the first line that is styled, but the first n lines. A generic line selector might work as follows:

P:nth-line(5n+5) {color: red} /* Every 5th line */
P:nth-line(-n+3) {font-weight: bold} /* First 3 lines */

There are no published proposals for such a feature so far.
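Assuming such a selector would reuse the an+b notation with the same meaning as in ':nth-child()' (an assumption, since nothing has been specified), the matching rule could be sketched in Python as follows. The function name is hypothetical.

```python
def matches_nth(line_number, a, b):
    """True if line_number (counting from 1) equals a*n + b for some
    integer n >= 0, as in the an+b notation of ':nth-child()'."""
    if a == 0:
        return line_number == b
    n, remainder = divmod(line_number - b, a)
    return remainder == 0 and n >= 0


print([i for i in range(1, 16) if matches_nth(i, 5, 5)])   # prints: [5, 10, 15]
print([i for i in range(1, 8) if matches_nth(i, -1, 3)])   # prints: [1, 2, 3]
```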

Line numbers

A common case of line-based style is line numbers, e.g., in poetry or in computer program code. Could this be done with a pseudo-class (:nth-line()), a counter and a float? Or is it better to have a separate property, just like list numbering has its own property?

In poetry especially, the designer should be able to either count empty lines or skip them. (But maybe this is indirectly controlled by using either a margin between verses or an actual line with no text.)

Line numbers can either be continuous for an element, or, more rarely, start over at every page.

Copyfitting

Copyfitting is the process of selecting fonts and other parameters such that text fits a given space. This may range from making a book have a certain number of pages, to making a word fit a certain box.

Micro-adjustments

If a page has enough content, nicer-looking alignments and line breaks can often be achieved by “cheating” a little: instead of the specified line height, use a fraction of a point more or less. Instead of the normal letter sizes, make the letters a tiny bit wider or narrower…

This can also help in balancing columns: In a newspaper, e.g., it may look better to have all columns of an article the same height at the cost of a slightly bigger line height in the last column, than to have all lines aligned but with a gap below the last column.

The French newspaper “Le Canard enchainé” is an example of a publication that favors full columns over equal line heights.

Automatic selection of font size

One common case is choosing a font size such that a headline exactly fills the width of the page.

A variant is the case where each individual line of the text may be given a different font size, as small as possible above a certain minimum.

Two models suggested for CSS are to see copyfitting either as one of several algorithms available for justification, and thus as a possible value for 'text-justify'; or as a way to treat overflow, and thus as a possible value for 'overflow-style'. Both can be useful and they can co-exist:

H1 {text-align: justify; text-justify: copyfit}
H2 {height: 10em; overflow: hidden; overflow-style: copyfit}

The first rule could mean that in each line of the block, rather than shrinking or stretching the interword space to fill out the line, the font size of each letter is decreased or increased by a certain factor so that the line is exactly filled out. The latter could mean that the font size of all text in the block is decreased or increased by a common factor so that the font size is as large as possible without causing the text to overflow. (As the example shows, this type of copyfitting requires the block's width and height to be set.)
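For the second model, finding the largest font size that does not overflow could be done with a simple search. The Python sketch below is one way to do that, not a description of any implementation: the function and parameter names are invented, the caller supplies a 'fits' test that lays out the text at a given size, and the search assumes that text which fits at one size also fits at every smaller size.

```python
def copyfit_font_size(fits, smallest, largest, tolerance=0.01):
    """Return the largest font size in [smallest, largest] that fits (sketch).

    fits(size) must return True if the text laid out at that size does
    not overflow its box. Returns None if even the smallest size overflows.
    """
    if not fits(smallest):
        return None
    if fits(largest):
        return largest
    low, high = smallest, largest  # invariant: fits(low) and not fits(high)
    while high - low > tolerance:
        middle = (low + high) / 2
        if fits(middle):
            low = middle
        else:
            high = middle
    return low
```

Each call to 'fits' stands for a complete layout of the block, so the number of iterations (here bounded by the tolerance) is directly related to the time limits discussed under “Page breaks and line breaks.”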

The title of the chapter is one word that exactly fills the width of the page.
A scanned page from a magazine.

Alternative content or style

If line breaks or page breaks turn out very bad, a designer may go back to the author and ask if he can't replace a word or change a sentence somewhere, or add or remove an image.

In CSS, we assume we cannot ask the author, but the author may have proposed alternatives in advance.

Alternatives can be in the style sheet (e.g., an alternative layout for some images) or in the source (e.g., alternative text for some sentence).

In the style sheet, those alternatives would be selected by some selector that only matches if that alternative is better by some measure than the first choice.

Some alternatives may be provided in the form of an algorithm instead of a set of fixed alternatives. E.g., in the case of alternative image content, the alternative may consist of progressively cropping and scaling the image up to a certain limit and in such a way that the most important content always remains visible.

E.g., an image of a group of people around two main characters can be divided into zones that are progressively less important: the room they are in, people's feet, the less important people, up to just the heads of the two main characters, which should always be there.

Change bars

Change bars are a kind of style that usually does not correspond to an element, but starts at one element and ends at another. With the INS and DEL elements of HTML, an author may be forced to use multiple such elements to respect the element nesting structure, but other mark-up languages often use empty elements to mark the start and end of a change, allowing the change to cross elements.

There are thus three problems for CSS:

The change bars should probably go just outside the content box of the enclosing block element. If the change bar crosses several block elements, they may thus not form a single continuous line. (And if one of those blocks has a vertical-writing mode…)

A property could indicate that an element represents the start or end of a change section. A starting element is then paired with an ending element in the same flow (ignoring nested pairs) and a pseudo-element can represent the whole extent.

mark[role=begin] {change-mark: start}  /* Hypothetical XML */
mark[role=end] {change-mark: end}
DEL, INS {change-mark: element}        /* HTML */
::change {
  change-bar-offset: 0.5em;
  change-bar-side: left;
  change-bar-width: medium;
  change-bar-style: solid;
  change-bar-color: black }

If a page contains one or more change bars, some types of publications also require that the running header contains a change bar or other mark.

Non-rectangular regions

Apart from the mechanism of floats, which cut out a rectangle from some other text, CSS level 2 has no way to create non-rectangular regions. A number of ways are under investigation for level 3.

Non-rectangular floating images

A relatively easy way (for authors and for implementers) to add new shapes is to allow a float to have an arbitrary contour. If the float is an image with transparent parts, it has an implicit contour already, that only needs to be activated:

IMG {float: left; shape-outside: contour}

Or an explicit shape can be assigned to the floating element: a circle, ellipse, rectangle, polygon, an external image used as bitmap mask, or an external SVG shape.

IMG {float: left; shape-outside: url(mask)}

A separate property, 'wrap-flow', determines if the text also fills any holes in the shape ('wrap-flow: both') or not ('wrap-flow: auto'). E.g., a V-shaped image with 'wrap-flow: both' may lead to a rendering like this:

###### Lorem ipsum dolor sit amet, consectetaur
### adipisicing ### elit, sed do eiusmod tempor
## incididunt  ### ut labore et dolore magna
#### aliqua.  ### Ut enim ad minim veniam, quis
#### nostrud ### exercitation ullamco laboris nisi
############### ut aliquip ex ea commodo consequat.
############## Duis aute irure dolor in reprehenderit
############# in voluptate velit esse cillum dolore
eu fugiat nulla pariatur. Irure dolor in reprehend
incididunt.

while 'wrap-flow: auto' wraps only on the side away from the float (i.e., on the right for a left float):

###### Lorem ipsum dolor sit amet, consectetaur
###             ### adipisicing elit, sed do eiusmod
##             ### tempor incididunt ut labore et
####          ### dolore magna aliqua. Ut enim ad
####         ### minim veniam, quis nostrud
############### exercitation ullamco laboris nisi ut
############## aliquip ex ea commodo consequat. Duis
############# aute irure dolor in reprehenderit in
voluptate velit esse cillum dolore eu fugiat nulla
pariatur. Irure dolor in reprehend incididunt.

Non-rectangular floating text elements

If the floating element is not an image, but text rendered by CSS itself, it can be given a 'shape-outside' as well, of course.

But it could even have an implicit contour, based on the outline of the letters it contains. This is most useful with very large fonts, of course, e.g., to create a drop-cap effect.

span#dropcaps {
  font-size: 5em;
  float: left;
  wrap-flow: auto;  /* = default */
  shape-outside: contour
}

See css3-gcpm for illustrations.

L-shaped regions in a grid

When an element or a page has a grid-based template, it may be possible to create non-rectangular regions under certain conditions. E.g., slot A in the following template has a non-rectangular shape:

grid: "A A"
      "B A"
      "A A"
      "A C"
      "A A"

Regions with “holes” in them are also possible this way.

If the grid has a fixed size, such non-rectangular regions are computationally not harder than wrapping around floats. But if the grid is flexible (depends on how much content has to fit in it), this requires multiple iterations.

There exist algorithms for solving such constraint-based systems, and in practice they are fast enough for static displays, but they may not be fast enough if the layout is part of an animation. In that case, the iterations have to be limited by time and thus different UAs may give different results, depending on the algorithms they use and the speed of the underlying machine.

A page with several regions, some of which have horizontal text and some vertical. The width of the bottom left box seems to depend on the amount of text inside.
Scanned page from a Japanese magazine.

Non-rectangular columns

Css3-multicol currently doesn't allow floats to intrude on more than one column. That restriction may be lifted in level 4. That is needed to create a float that is centered between two columns. You could also do that with grid templates instead of columns, if you know beforehand that the float exists:

body {
  grid:  *  *  * 2em *  *  *
        "a  a  a  .  b  b  b"
        "a  a  a  .  b  b  b"
        "a  a  Z  Z  Z  b  b"
        "a  a  a  .  b  b  b"
        "a  a  a  .  b  b  b";
  chains: a b;
}
img#p1 {flow: Z}

A grid has a fixed number of columns, but could allow non-rectangular regions. The 'columns' property allows the number of columns to grow with the amount of content, but does not allow non-rectangular columns. However, if you regard the columns as creating an implicit grid and add a way to select a column, i.e., a pseudo-element '::column(n)' or '::region(n)', then you could give the columns properties, just like '::slot()' gives properties to regions in a grid.

In particular, you could give columns margins. A negative margin would then cause a column to partly overlap another and act as an exclusion. ('z-index' determines which is “on top.”)

The idea is in css3-gcpm, but it has yet to be investigated whether it can work. Clearly, it requires multiple passes to determine the number of columns.

Shapes for elements

The length of each line in a block element is determined by the width of the element (and any floats that intrude on the element). The height of an element can also be set, which effectively defines a rectangle. Css3-exclusions proposes to generalize this rectangle to an arbitrary shape: a circle, ellipse, polygon or a shape defined by an external image.

An example of this would be to combine a background image and a shape, so that the element appears to flow around the background image. But the background image thus need not be an actual element, which is good if it is purely ornamental.

article {
  background: url(grass.jpg) bottom left no-repeat;
  shape-inside: url(grass-mask.svg);
}

However, unlike background images, shapes do not repeat in the current css3-exclusions, nor can there be multiple shapes.

If the background is an image (or several images) with transparent parts, its contour can be used directly, and in that case the shape is repeated:

article {
  background: url(grass.jpg) bottom left no-repeat;
  shape-inside: contour;
}

For more control over what is considered “transparent,” the 'shape-image-threshold' property specifies the maximum alpha value.

Positioned exclusions

Floats create exclusions when there is actual content to float while grids allow elements to be put in exclusions independent of the document order. Combining those (allow arbitrary order of elements and create an exclusion only when there is content for it) would be something like absolutely positioned floats:

IMG#p1 {position: absolute; top:...; left:...;
  wrap-flow: both}

The 'wrap-flow' says that the IMG is an exclusion for whatever else happens to be in that location (restricted to a particular “stacking context” or similar). Because the positioned element may occur in the source after the content it displaces, this may require multiple iterations of the layout.

Packaging and off-line use

The Web is based on a multitude of specialized formats with URLs as the glue. A book made with Web technology thus typically consists of multiple files. EPUB defines a packaging format for those files. In principle, CSS can use any URL and as long as there exist URLs for referring to the files in the package, CSS should just work. However, there are some closely related technologies that are not independent of URLs.

Embedded fonts

CSS allows for fonts to be embedded in the style sheet. In most cases, the font is actually transcluded, at least from the point of view of CSS, but some extra mechanism may bind the font to the document in case the copyright license on the fonts does not allow them to be re-published. Such a mechanism exists for the EOT and WOFF formats:

The EOT format stores in the font file the absolute URL (or some prefix of it) of the document for which the font may be used. The URL is omitted for fonts that may be re-published freely. An offline e-book does not have an absolute URL, which means fonts in the EOT format can only be used in an offline book if the font's license allows re-publishing.

The WOFF format stores that URL prefix in an HTTP header that is returned with the font when it is served by an HTTP server. If the font may be re-published freely, that can be indicated by using “*” instead of a URL prefix. But in the absence of that header, or if the font is not served by an HTTP server, the URL prefix is assumed to consist of the protocol, server and port of the server that served the font. It is undefined what that means for an offline book, which means that fonts in the WOFF format, even free ones, cannot be used offline.

Free fonts in the WOFF format may thus have to be converted to OpenType prior to packaging.

Replacement fonts

When the license of a font does not allow embedding at all, it can only be used to produce paper books and the e-books will need a replacement font.

CSS allows authors to specify a list of fallback fonts. However, if none of those is available on the user's system, the user agent has no way of knowing what those fonts look like and thus cannot try to substitute a closely resembling font.

CSS has a PANOSE font descriptor for that purpose, but it is not yet copied to any of the level-3 modules and thus likely to become deprecated. It only allows PANOSE version 1, which is suitable for font designs for the Latin and Cyrillic script.

SVG can use CSS to specify fonts, but also allows fonts to be specified directly with attributes in SVG. In the latter case, an attribute can specify the PANOSE characteristics of the font (PANOSE version 1 only).

User annotations

The reader of an e-book may want to bookmark pages, scribble in the margin, or store (style) preferences with the book. This may involve some CSS.

User control of styling

Based on experience with current e-readers, publishers expect that users will more often want to select alternative styles or personal styles when they read e-books than when reading pages online. That can be as simple as preferring landscape over portrait mode, but it can go much further.

The CSS model allows for style sheets from the author, the reader and the user agent. It also requires that users be able to turn style sheets off and select among alternatives, if there are some. But it doesn't define a user interface.

Alternative layouts

A publication may come with several style sheets, to offer the user different choices or to accommodate different kinds of screens. On larger screens or wider paper, these layouts may be based on grids with different numbers of columns and rows and each layout may have a different visual order for the elements of the text: things that would be relegated to the end in a single-column design could occupy the first column if there is more horizontal space to work with.

This requires either the ability to integrate document transformations in the style sheet or a template mechanism that allows positioning content.

The former method could be based on XSLT: the document links to an XSLT style sheet instead of a CSS one, and it is the XSLT that in turn applies CSS to the transformed document. This would allow the use of CSS features such as Flexible Box Layout, which are meant for graphical user interfaces and thus require different element trees for different layouts:

<link href="n.xslt" rel="alternate stylesheet" title="One column">
<link href="m.xslt" rel="alternate stylesheet" title="Three columns">
<link href="w.xslt" rel="alternate stylesheet" title="Five columns">

A disadvantage of using XSLT is that it makes editing the style harder. (If the style is used for many documents, that is less of an issue.)

The second method requires the 'flow' property from css3-layout/css3-regions: The layout is described with a template, which is to a large extent independent of the document tree, and then the elements are “flowed” into the slots of the template:

<link href="n.css" rel="alternate stylesheet" title="One column">
<link href="m.css" rel="alternate stylesheet" title="Three columns">
<link href="w.css" rel="alternate stylesheet" title="Five columns">

where the m.css could contain, e.g.:

body {grid:  * 1em * 1em *
            "a  .  b  .  c"  fit-content
            ".  .  .  .  ."  1em
            "d  d  d  d  d"  fit-content }
.menu, .nav {flow: a}
.main {flow: b}
.notes {flow: c}
.signature, .endmatter {flow: d}

The 'grid' property sets up a template, which in this case has five columns (three of equal width and two 1em wide each), three rows (two of flexible height and one 1em high) and four slots for content (a, b, c and d). The 'flow' property then distributes the elements over the slots, in this case based on their class attributes. The w.css style sheet would have a larger grid, possibly with more slots, and the contents would be distributed differently.
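As a sketch only, w.css might then look as follows. It reuses the draft 'grid' and 'flow' syntax of the m.css example. The slot names and the way the classes are distributed over them are assumptions made for illustration:

```css
/* w.css: hypothetical five-column variant. Nine grid columns:
   five of equal width, separated by four 1em gutters. */
body {grid:  * 1em * 1em * 1em * 1em *
            "a  .  b  .  c  .  d  .  e"  fit-content
            ".  .  .  .  .  .  .  .  ."  1em
            "f  f  f  f  f  f  f  f  f"  fit-content }
/* With more slots, classes that shared a slot in m.css can be separated. */
.menu {flow: a}
.nav {flow: b}
.main {flow: c}
.notes {flow: d}
.endmatter {flow: e}
.signature {flow: f}
```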

Drop caps

CSS level 2 has drop caps, but provides no properties that help with aligning them. Typically, the baseline of the drop cap should be aligned with the baseline of a text line.

If the drop cap is very big, the designer may want the text lines to wrap around the actual shape of the letter, instead of around a bounding box. The drop cap would act somewhat like a shaped float in that case.
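The following sketch shows the kind of hand-tuning that is needed today. It uses only a floated '::first-letter', and every number in it is a hypothetical value that has to be adjusted by eye for a given font and line height. The class name is an assumption as well:

```css
/* Drop cap over roughly three lines. Nothing here ties the letter's
   baseline to a text line: the values are found by trial and error. */
p.opening::first-letter {
  float: left;
  font-size: 3.6em;      /* hypothetical: about three text lines tall */
  line-height: 0.8;      /* hypothetical: trims space above and below */
  margin: 0.08em 0.1em 0 0;
}
```

Changing the font or the line height of the paragraph typically breaks the alignment, which is why dedicated alignment properties would help.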

A drop-cap with the baseline aligned to the fourth text line and the top of the letter aligned to the ascenders of the first text line. (Left the whole page, right an enlargement.)
A scanned page from a magazine with a drop-cap “A” over four lines.

Vertical text

Some languages are always written vertically; others, such as Japanese, can be written horizontally or vertically, but are more often written vertically in paged media.

The css3-writing-modes module proposes properties for switching between vertical and horizontal and for the text effects that only occur in vertical, such as rotated letters and combining narrow horizontal letters into a single letter-like box (“tate-chu-yoko”).

But other modules are affected, too. Vertical text changes the interpretation of some properties, e.g.: 'line-height' is interpreted effectively as a line width; 'text-align' needs new 'top' and 'bottom' keywords; 'direction: rtl' for Hebrew or Arabic inside vertical text is interpreted to mean bottom-to-top. Others are unchanged, e.g.: 'margin-left' is still on the left, the '@top-left' box for running headers is still on the top left, '@page :left' still selects the left-hand page.
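A small sketch of how these pieces fit together, using properties from the css3-writing-modes draft. The class names and values are hypothetical:

```css
/* Vertical text: lines run top to bottom and stack from right to left. */
.vertical {
  writing-mode: vertical-rl;
  line-height: 1.8;      /* now effectively the width of each vertical line */
  margin-left: 2em;      /* still a physical margin, on the left */
}

/* Tate-chu-yoko: set a short horizontal run, such as a two-digit
   number, upright within the space of a single character. */
.vertical .digits {
  text-combine-upright: all;
}

/* Page selectors and margin boxes keep their physical meaning. */
@page :left {
  @top-left { content: "Title" }
}
```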

Vertical text, such as for Japanese, has been worked on in CSS since early 1999, i.e., already during the development of CSS level 2. But the proposal wasn't satisfactory. A new model was tried in 2001, after looking at XSL's model. It even reached Candidate Recommendation status in 2003 (and was implemented by Microsoft), before it, too, was found to be insufficient. Since 2010, the CSS WG has been working on its third attempt.

Hanging punctuation

Properties for hanging punctuation (the effect that small punctuation marks, when they happen to occur at the start or end of a line, are placed in the margin outside the line box) are proposed in css3-text.
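For instance, assuming the 'hanging-punctuation' property as drafted in css3-text, a justified paragraph could ask for hanging marks like this (a sketch, not a tested style sheet):

```css
p {
  text-align: justify;
  /* 'first': an opening quote or bracket at the start of the first line hangs.
     'allow-end': a stop or comma at the end of a line may hang
     if it would not otherwise fit. */
  hanging-punctuation: first allow-end;
}
```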

Logical mark-up vs typographical mark-up

Typographical traditions often give the punctuation that follows a phrase the same style (bold, italic) as that phrase, even though logically it doesn't belong to the phrase. Sometimes, e.g., in American English, punctuation is even put inside a quoted phrase. For translating, speech synthesis or text analysis it would be nice if the mark-up closely followed the semantics, but then CSS needs more features to style text that isn't an element.

Logical vs typographical punctuation

When marking up text independent of the style, the “logical” way to mark up inline phrases is to exclude punctuation that is not strictly part of the phrase. E.g., the comma and period are outside the A elements in this fragment:

... trees such as <a>oak</a>,
<a>pine</a> and <a>willow</a>.

But when applying style to such phrases, the punctuation is typically included:

… trees such as oak, pine and willow. (bold)

… trees such as oak, pine and willow. (italics)

… trees such as oak, pine and willow. (color)

This could maybe be fixed automatically by the formatter with heuristic rules, and a property to turn those rules off in difficult cases ('punctuation-styling: auto' (default) vs 'none'). The heuristic rule is that punctuation is included in the style of the preceding or following text (whether source text or generated) if there is no padding, border and margin between that text and the punctuation.

This doesn't apply to all punctuation: It includes commas, periods, colons, semicolons, exclamation marks, question marks, apostrophes, etc. (maybe all of Unicode category Po). But it does not apply to parentheses, brackets, and similar (categories Ps & Pe) and quote marks (categories Pi & Pf).

White space is not text. Thus, e.g., both in English and in French the comma will normally get the style of the preceding word, but a colon or semicolon in French is preceded by a (thin) space, and thus will remain in its own style:

... du <em>beaujolais</em>, dit-il. → … du beaujolais, dit-il.

... a <em>bird</em>, of course. → … a bird, of course.

... <em>le pont</em> : toujours. → … le pont : toujours.

... a <em>compound</em>: several. → … a compound: several.
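In a style sheet, the suggested property might be used as in the sketch below. 'punctuation-styling' is only the hypothetical property proposed in this section, and the class names are assumptions:

```css
/* Default behaviour: adjacent small punctuation takes the emphasis style. */
em {font-style: italic; punctuation-styling: auto}

/* Difficult case: switch the heuristic off for this element. */
em.exact {punctuation-styling: none}

/* The heuristic would not apply here anyway, because padding and
   a border separate the text from the punctuation that follows. */
em.boxed {padding: 0 0.2em; border: thin solid}
```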

See also Text fix-up for an alternative syntax.

Issue: Would a Spanish opening exclamation mark (¡) or question mark (¿) that happens to be before an italic phrase be put in italics, too?

Including punctuation in quotations

If a phrase is styled not by making it italic or bold, but by enclosing it in quote marks, punctuation is typically not moved inside the quote marks, except in American English: The tradition is that periods and commas that logically belong outside the quoted phrase are nevertheless moved inside: ‘Hi,’ he said. vs ‘Hi’, he said.

This could be handled with another keyword for the 'punctuation-styling' property suggested above: 'punctuation-styling: small'. See also Text fix-up for an alternative syntax.

Replacing quotes

The 'quotes' property is an easy way to change or suppress the quotes that are generated around elements, in particular the Q element. But if the quote marks are already in the text and we want to replace them, we either need a way to suppress them, or a way to substitute different characters. (See Possible enhancements to the <q> element in Quote marks for a discussion of quote marks in text.)

The 'text-replace' property (see Replacing glyphs) might help, but it is not necessarily precise enough.

Possibly a better option is a 'suppress-quotes' property to suppress just the opening and closing quote inside an element, i.e., the first and last non-blank characters of the element, and only if they have the Unicode property Quotation_Mark.

Some people argue that quote marks, if added manually, should be outside the element that represents the quotation, i.e., ‘<q>Hello</q>’ rather than <q>‘Hello’</q>. If that style is used, 'suppress-quotes' should be able to express that, too.

Or not a property but a keyword: see Text fix-up.

French punctuation spacing

In most typographical traditions it is no longer customary to put a (narrow) space before punctuation, but in French it is still very common. No space is put before small punctuation (. and ,), but a narrow no-break space is put before exclamation marks (!), question marks (?) and semicolons (;) and a no-break space before colons (:).

A no-break space is also put after the opening quote mark in French (called guillemet: «) and before the closing quote mark (»).

Different typographers differ a bit in how they apply these rules: some use only no-break spaces, some only narrow no-break spaces. It is difficult to type no-break spaces on a typical computer keyboard, let alone narrow no-break spaces, so authors typically just put a space and hope that it won't look too bad, and, especially, that there won't be a line break at that space.

But CSS might be able to fix these spaces. A property could turn on an automatic correction that inserts or replaces spaces around these punctuation marks. It would need to allow the designer to specify which spaces he wants: no-break spaces, narrow no-break spaces, or a mixture based on the punctuation mark.

Em-dashes and en-dashes

Em-dashes are often used in American English in pairs as an alternative for parentheses, or one at a time instead of a comma or ellipsis. American typographers normally write them without spaces around the dashes: word—another.

In European traditions, dashes can be used the same way, but European typographers tend to prefer the shorter en-dash with spaces on both sides: word – another.

It would be nice if one style could be changed into the other with CSS, but HTML has no mark-up for parenthesized remarks or pauses. Some sort of text replacement might help.

Text fix-up

Rather than several properties to fix up the typography for logically marked-up elements, there could be a single one (or a shorthand):

q {text-fix: suppress-quotes include-punctuation}

The default would be 'none' and other possible keywords are 'suppress-quotes', 'suppress-quotes-around' (see Replacing quotes), 'include-punctuation' (see Logical vs typographical punctuation), 'include-small-punctuation' (see Including punctuation in quotations), 'french-spaces', 'narrow-french-spaces' (see French punctuation spacing), 'long-dashes' and 'short-dashes' (see Em-dashes and en-dashes).

The property is inherited.
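Because the property is inherited, it could be set once per language, as in this sketch. 'text-fix' and its keywords are the hypothetical ones listed above. The mapping of 'long-dashes' to American usage and 'short-dashes' to European usage is an assumption based on the section on dashes:

```css
/* Set on the language root: inherited by all descendants. */
:lang(fr) {text-fix: narrow-french-spaces}
:lang(en-US) {text-fix: include-small-punctuation long-dashes}
:lang(en-GB) {text-fix: short-dashes}

/* A rule on a descendant replaces the whole inherited value,
   so keywords that must be kept have to be repeated. */
:lang(fr) q {text-fix: suppress-quotes narrow-french-spaces}
```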

Italic correction

When italic text is followed immediately by roman text, such as an italic word followed by a closing parenthesis or a closing quote, some extra space is needed to avoid overlap of the last letter and the punctuation mark: the italic correction. The size depends on the shape of the letter. An “o” needs no extra space, but a “d” does.

Like the typographical punctuation above, this could be handled automatically with heuristic rules, aided by a property to turn the rules off if the designer would rather control the space manually: 'italic-correction: auto' (default) vs 'none'.
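A sketch of both modes follows. 'italic-correction' is the hypothetical property suggested here, and the class name and margin value are assumptions:

```css
/* Let the formatter add space based on the shape of the last letter. */
em {font-style: italic; italic-correction: auto}

/* Manual control: no automatic correction, a fixed space instead.
   A fixed margin cannot tell a final "d" from a final "o". */
em.manual {italic-correction: none; margin-right: 0.1em}
```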

Ruby

CSS has a module for ruby (small annotations above letters, especially in ideographic scripts). css3-ruby has been worked on for a long time, but is still in Working Draft state.

Mathematics

The inclusion of MathML in HTML5 is a big step forward for the publication of documents containing mathematics. Previously, you could combine MathML with XHTML only by means of namespaces (which leads at most to a syntactically valid document, but not to a standard format with defined semantics, which can be supported by software).

Unfortunately, despite initial efforts towards a draft already in 1999, the CSS WG never managed to publish a Working Draft for mathematical typography.

The Math WG, in order to help initial support of HTML5, analyzed the existing CSS and published a sample style sheet, together with the subset of MathML that could at least be rendered in a readable way.

But to do proper math rendering, and to support the rest of MathML, CSS needs new kinds of boxes (values for the 'display' property) for built-up formulas, properties for stretching operators, properties for baseline alignment, properties for line breaking in formulas, etc.

Equation numbers aren't part of math proper, but are common in texts that contain formulas. (They are also used for other kinds of displayed content, such as chemical formulas or examples of phrases in linguistics.) Different publications have different traditions for their style (placement left or right, content and alignment). Some study is needed to see if CSS can describe all styles. (See also Equation numbers for using labels in cross-references.)

Math-like

Chemistry, linguistics and other disciplines also use layout conventions that look somewhat like formulas or diagrams. Chemistry, e.g., uses subscripts and superscripts together, and they should be aligned vertically. Linguistics uses equation numbers, and also displays of multiple lines, aligned at certain points and bracketed with big parentheses.

Accessibility

Generated content

If a screen reader reads the paginated document, trying to interpret the visual layout, it may or may not read text generated with the 'content' property. Text generated for running headers (inside '@page') is probably not very useful, while text generated inside '::after' and '::before' probably is.
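The two kinds of generated text can be told apart in the style sheet itself, as this sketch suggests. It assumes the 'string-set' property and 'string()' notation from the paged media drafts. The class name and the label text are hypothetical:

```css
/* Running header: repeats the chapter title on every page.
   Probably noise for a screen reader. */
h1 {string-set: chapter content()}
@page {
  @top-left {content: string(chapter)}
}

/* Inline label: part of the reading flow.
   Probably worth reading aloud. */
.warning::before {content: "Warning: "}
```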

Generated navigation menus

When a table of contents or an index is generated by the style sheet, it is probably useful for a screen reader to be able to read it.

Interactive vs static

The media type 'print' in CSS is defined as static. However, when a document with a 'print' style sheet is read on a screen, it could also be interactive: forms and hyperlinks could in theory still work.

If the medium is known to be interactive, some styles could be different. E.g., it may not be necessary to add page numbers after links when the links can still be activated as hyperlinks.

Maybe there is a need for a new 'interactive' media feature ('@media print and (interactive)') or a new media type ('@media e-reader').
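A sketch of how such a media feature might be used. The 'interactive' feature is only the hypothetical one suggested above. The 'target-counter()' notation comes from the generated content drafts, where the exact form of the 'attr()' argument has varied between versions:

```css
/* Static print: add the page number of the link target after each link. */
@media print {
  a[href]::after {
    content: " (page " target-counter(attr(href url), page) ")";
  }
}

/* Hypothetical: paginated but interactive, so the links still work
   and the page numbers can be dropped. */
@media print and (interactive) {
  a[href]::after {content: none}
}
```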

Transclusions

Transclusions are links that are expanded in place. E.g., the typical way to render an IMG element in HTML is to expand it in place. Transclusions can be rendered inline by default (such as the IMG element) or after user activation.

Seamless IFRAMEs

CSS has a notion of “intrinsic size” for transclusions, which is meant to allow images to be rendered inline at their normal size. But that size is only defined for transclusions that have a fixed width-height ratio. The SEAMLESS attribute in HTML5 indicates that an IFRAME is meant to be transcluded at its intrinsic size, even if it consists of text. It should act just like a normal element: CSS sets the width and the height depends on the content. In particular, a narrower width will typically result in a greater height, unlike for transclusions with a fixed aspect ratio, where a narrower width also results in a smaller height.

Multiple source documents

An e-book is typically made up of several files, e.g., one per chapter and a small main file to indicate the right order. But such a set is still rendered as a single document, with, e.g., continuous page numbers.

This is probably the same problem as the seamless IFRAMEs above: each chapter is a transclusion in the main document. Only the particular mark-up differs. (In EPUB3, the main document has a list of <item href=…> elements, each pointing to a chapter.)

Media types and features

Simultaneous visual and speech rendering (“reader”)

CSS Media Queries defines only a handful of media types and media features. It is not clear what type, e.g., an e-book reader falls under, or an e-book reader that can also speak.

There is an old proposal for a type 'reader' for a device that renders both visually and in speech, with the speech and the visuals synchronized or alternated under user control. (It was originally intended for assistive technology.)

Second screen

The term “second screen” usually refers to the fact that many people now watch TV with a laptop, tablet or smartphone on standby at their side, so they can immediately verify something they see on TV, get more information, chat with other viewers, or give feedback.

The second screen can also be part of an integrated system with two screens working together, e.g., a presenter can show slides on a big screen, while controlling them from a small screen; or a video editor can show the video on one screen and the editing controls on another.

CSS currently has no media type for such a two-screen system and no properties to position content on one screen or another. All visual CSS properties assume a single “canvas” and a single “viewport” onto that canvas.

Acknowledgements

With thanks to Liam Quin.