Can you typeset a book with CSS?

Photo: manual layout in an old manuscript
Photo: mixed writing modes in a modern magazine

Bert Bos (W3C) <bert@w3.org>

eBooks & i18n: Richer Internationalization for eBooks
(2nd W3C Workshop on Electronic Books and the Open Web Platform)
Tokyo, Japan, 4 June 2013

Summary

Assume

At some stage in its production, a book, magazine or e-book consists of (X)HTML or XML

… and maybe MathML (or HTML5), PNG, JPEG, SVG, audio, video and metadata

Question

Can you then use CSS to typeset it?

… for paper, for PDF, for an e-book or for the Web

Yes?

At first sight, yes,

Can you typeset a book with CSS?

At first sight, the answer seems obviously yes. The book Cascading Style Sheets, designing for the Web, which Håkon Lie and I wrote back in 2005 (for the 3rd edition), was written in valid, clean HTML, with images in PNG, and a CSS style sheet to turn it into “camera-ready” PDF for the printers. And now, in 2013, publishers are using CSS every day to make books.

No!

On closer look, no,

But on closer look the answer has to be no. Although our HTML and PNG were standard, we had to do a number of tricks to get the layout needed for a real book: e.g., we used TeX for the hyphenation and we used a number of proprietary extensions in the formatter. (We used Prince.) In fact, Michael Day, the man behind that formatter, on several occasions enhanced the software specifically for us. People who are making books still rely on such tricks and proprietary extensions in the software they use.

Why?

Why can't CSS typeset books?

But the demand on CSS is increasing

… so let's assume we want to extend CSS

Why is that? Our, i.e., the CSS Working Group's, excuse has always been that (1) CSS was designed only for simple layouts and meant to be easy to use, and there was XSL (XSLT and XSL-FO) for advanced publications, especially for print; and (2) several of the requirements of book publishing are actually quite difficult to solve. It takes time to understand them, acquire the expertise, analyze the solutions, test them…

But the demand on CSS is increasing, especially now that XSL-FO development has stopped and we don't know whether or when it can restart. And so it is time to look seriously at whether CSS can acquire the needed functionality, and if so, how.

CSS has shown a longevity and a capability to grow that I certainly didn't expect back in 1994-1998, even though I designed it to be extensible. On the other hand, the increased size already means that it isn't easy for people to learn CSS anymore and we should ask ourselves if it isn't better to leave CSS alone and create a new style sheet standard that, from the start, is meant to be good enough for complex publications.

But, for the purpose of this talk, let's assume that we want to add functionality to CSS and let's look at some examples of requirements and the solutions that have been proposed for them, if any.

See also

Other lists (more complete than this talk)

I've started collecting requirements in a document (List of CSS features required for paged media). It is still rather unstructured and incomplete, but it already contains enough to give a sense of the size of the task in front of us, even if we don't do it (all) in CSS.

Although not all the requirements in the list are only for paged media, many of them are more important in paginated renderings than in a scrolling display, and thus it seems useful to refer to them collectively under that theme.

The XSL WG collected a list of requirements for XSL-FO 2 in 2008. There are many requirements on that list that aren't in mine yet and that also have to be considered for CSS at some point.

Element-based
vs region-based

Original, simplifying assumption for CSS:

The mark-up forms a tree

In simple layouts, the style mostly follows the mark-up

But: extensible to document-independent regions if necessary later (@-rules…)

When we designed CSS, we made the simplifying assumption that, at least in a single-column scrolling layout, the bulk of the style closely follows the mark-up structure. And so we concentrated on that. The fundamental model of CSS is to take the document tree and add style to every element and only to elements: a bit of margin, a color, a font, maybe a list number, etc. Where typography required style that didn't follow the semantics, we thought we would add some ad-hoc exceptions (such as 'first-line' and 'first-letter') or just ignore it.

We did, however, build in a way to extend CSS later, if necessary, with ways to create visual structures that were independent of the document tree. The primary hooks for such extensions in CSS are the so-called at-rules (or @-rules). E.g., already in 1996, even before we standardized CSS level 1, we published a note about possible ways to add page templates to CSS. (In ways too complicated to explain here, driven by the browser wars at the time, that note led eventually to the inclusion of “absolute positioning” in CSS level 2, a feature that has very little to do with the original idea and that finally nobody liked; but such is history.)

Of course, even in simple documents, typography already doesn't always follow the mark-up structure. E.g., if you decide to mark-up foreign words (<span lang=de>Buch</span>) and then the designer decides to typeset such foreign words in italic ([lang] {font-style: italic}), then you will have a problem if the foreign word is followed by punctuation, because, for reasons of aesthetics, the punctuation should then also be italic. We decided to just ignore that problem.

But as soon as you want to do more interesting layouts, the assumption starts to hold less and less. In a book, there are, e.g., running headers and footers. They do not correspond to any element in the document, even if the content is often derived from the document in some way. Other examples are tables of contents and indexes. They are likewise derived from the document, but you would typically want them to be created by the computer, not by the author. And even though they aren't elements in the document, you want to be able to style them.

Requirements

Let's look at some requirements

Running headers

Running headers example

Photo: A running header with a bit of math

Looks simple enough, but the best we can do so far is:
… integral ∫0bxp dx when…

Let's look a bit more closely at running headers. We found that we could, with the at-rules I mentioned before and a single predefined page template, provide something that was at the same time flexible enough for quite a number of books and simple enough to be understood by most people. This became the css3-page module (CSS Paged Media Module Level 3). You can put text in various places around the edge of each page, differently on left and right pages, even extract some text from the document for that purpose to some extent, and style the text, also to a limited extent.

What are those limits? And what if we need more?

Simple cases

The text in the running headers can consist of fixed text (the same throughout the document), text copied from elements in the document and counters (generated numbers, such as page numbers) or a mixture of those. You cannot manipulate the copied text (modify it, do calculations on it).

There are a total of thirteen boxes and each box has only one style (a single font, a single color, etc.). They have just enough freedom in their placement and size that you can sometimes combine two of them in the same spot, but not more than that.

Actually, the possibility to copy text from elements isn't in css3-page, but in css3-gcpm (CSS Generated Content for Paged Media Module), which is only a Working Draft, but there are experimental implementations of that feature and I expect that it will become a standard eventually.
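As a minimal sketch of the simple case described above: the style sheet below copies the text of each chapter heading into a named string and prints it in a margin box, with the page number on the opposite side. The 'string-set' property and the 'content()' value follow the css3-gcpm draft and may change; the string name "chapter" and the use of h1 for chapter headings are assumptions made for the illustration.

```css
/* Copy the text of each chapter heading into a named string.
   'string-set' is from the css3-gcpm draft; "chapter" is an arbitrary name. */
h1 { string-set: chapter content(); }

/* Left pages: page number at the outer (left) edge of the top margin. */
@page :left {
  @top-left { content: counter(page); }
}

/* Right pages: the copied chapter title in a single style,
   and the page number at the outer (right) edge. */
@page :right {
  @top-center { content: string(chapter); font-style: italic; }
  @top-right  { content: counter(page); }
}
```

Each margin box takes exactly one font and one color, which is the limitation discussed above: any mark-up inside the heading is lost when its text is copied.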

Special content

Sometimes the content of a running header is neither constant throughout the document nor a copy of some heading in the document. This may happen in technical manuals, for example, where the running headers act a bit like section headings to structure the text, except that they aren't printed in the middle of the page, but only at the top and repeated on every page of that section.

The title element in HTML is also an element that is normally not displayed in-line, but can be used for a running header. (However, it has no sub-structure.)

The already mentioned css3-gcpm contains a proposal for designating elements as “running elements.” Such elements can have mark-up (child elements) and be styled, allowing for running headers that aren't limited to a single line of text in a single style anymore.

This, however, is no more than an early proposal and what it will turn into when we seriously start to discuss it I can't predict.

The Prince formatter that I mentioned earlier has a similar feature with a different syntax. I'm interested in hearing what people who used it think of it.

For people who know XSL-FO: the CSS proposals have similarities to the retrieve-marker element in XSL-FO and it may be worth looking more at how that element works.
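For illustration, the running-element proposal in css3-gcpm could be used roughly as follows. This is a sketch of a draft feature, not settled syntax; the class name "running-title" and the identifier "header" are hypothetical.

```css
/* Take the element out of the normal flow and make it available
   to the page margin boxes under the name "header". */
div.running-title { position: running(header); }

/* Place a copy of that element, mark-up included, in the top margin. */
@page {
  @top-center { content: element(header); }
}

/* Because the mark-up is preserved, child elements can be styled. */
div.running-title em  { font-style: italic; }
div.running-title sup { font-size: 70%; vertical-align: super; }
```

Unlike 'string()', which copies only text, 'element()' carries the element with its children, so a running header can have more than one style.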

Copies of complex headings

In some books the running header is “just” a literal copy of some section header, but those headings have sub-parts. E.g., the heading may have mathematics in it (even just a superscript), or contain bidi-text that requires explicit mark-up (direction overrides). In such cases, applying a single style to the whole of the copied running header is not sufficient. The different parts of the header need different styles.

Photo: A running header with a bit of math

The running header looks simple, but it contains a math formula.

A quick survey (not at all scientific) among a couple of mathematicians showed that about a quarter to a third of the math books in their possession contain at least one such running header. Which is enough to conclude that we cannot ignore this requirement, especially now that HTML5 contains math.

There is, as far as I know, no proposal yet for how to handle this in CSS. Even css3-gcpm only mentions it as an open issue. I wrote in my list of requirements that a solution will probably require a CSS selector that distinguishes an element based on where it is used, maybe with a pseudo-class (':original', ':copy'). There is work in CSS on a generic solution for styling elements based on the region they are in, rather than their position in the document. So far that doesn't handle running headers, but that may come.

Running headers that are not at the top

The simple, predefined page template that CSS offers (in css3-page) is only good enough if the running headers and footers are along the edges of the page, outside the page body. You can cheat a bit with their margins and make them overlap the page body, but if you want, e.g., a page number right in the middle of the page, you'll need another kind of page template.

The existing proposals for how to make such page templates are based on the idea of dividing the page into regions along imaginary grid lines. Each region can either be part of a “flow” of text or be a running header with repeating text (or remain empty, of course). E.g., css3-layout (CSS Template Layout Module) contains this example. (This is from the Editors' Draft; page-based templates as such are mentioned in the latest published Working Draft, but running headers are not.)

@page {
 grid: "t1"  1.2em    /* space for 1st running header */
       "t2"  1.2em    /* space for 2nd running header */
       "."   2em      /* 2em of empty space */
       "*"            /* page body */
}
::slot(t1) { content: string(chapter); color: red; text-align: center }
::slot(t2) { content: string(section); color: green; text-align: center }

This uses the 'string()' syntax from css3-gcpm, which only allows a single style for the whole string, but the point of the example is to show that you can define arbitrarily many regions for running headers (here just two), give them names (here t1 and t2) and position them anywhere on a grid (here a simple grid with four rows and one column). E.g., to put the same two running headers side by side at the bottom, the grid template would be changed to this grid of three rows by two columns:

@page {
 grid: "*  * "          /* page body */
       ".  . "  2em     /* 2em of empty space */
       "t1 t2"  1.2em   /* two regions side by side */
}

(This syntax is the compact, shorthand form, which is meant for advanced users. Beginners may want to use the longhand, which uses three separate properties.)

To put a page number in region t2, the syntax is then the same as for the predefined page templates, with '::slot(t2)' replacing the name of the predefined region:

::slot(t2) { content: counter(page); color: green }

Miscellaneous 1

Grid example

Scan from a magazine.

Note that the structure of the grid does not directly correspond to the structure of the document. Also, there is variable space above the blue paragraphs.

Before we look at another one of the requirements in detail, let's quickly list a few that I don't have time for in this talk.

Footnotes

Footnotes are a vast subject. In the simplest case we have one type of footnote, numbered consecutively throughout the document, and the footnotes are inserted at the bottom of the page. Even then there is already the complication that a footnote should preferably be on the same page as the word it belongs to, but that isn't always possible. And you may want a horizontal rule above the footnotes, but only if there actually is a footnote. It gets more complicated with numbering per page, which requires multiple passes (which may not converge), with multiple types of footnotes, numbered separately, with footnotes positioned under columns instead of under the page, and with marginal notes. A subtle issue is also to make sure the footnote number is immediately after the word it belongs to, even if the mark-up has a space between that word and the footnote…

css3-gcpm contains some early ideas for footnotes in CSS and there are proprietary extensions in some software.
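A sketch of the simplest case, using the early footnote ideas from css3-gcpm (draft syntax that may well change); the class name "footnote" is an assumption:

```css
/* Move the element to the footnote area of the page where it occurs. */
span.footnote { float: footnote; }

/* The footnote area exists only on pages that actually have a footnote,
   so the rule above it appears only when needed. */
@page {
  @footnote {
    border-top: thin solid black;
    padding-top: 0.5em;
  }
}

/* The reference mark left behind in the text. */
::footnote-call {
  content: counter(footnote);
  vertical-align: super;
  font-size: 70%;
}

/* The number in front of the footnote itself. */
::footnote-marker { content: counter(footnote) ". "; }
```

This covers one series of consecutively numbered footnotes at the bottom of the page; per-page numbering, multiple series, column footnotes and marginal notes are not addressed by this sketch.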

Cross-references

Hypertext has active links: you click on them and within two or three seconds the computer shows you what they refer to. But in a book you have to find the target of the link yourself. For that purpose, a cross-reference usually contains a page number or a section number. That number is usually generated automatically, because the author doesn't know the number when writing the text, and indeed it may be different in different printings of the same book. Ideas for relevant CSS properties are in css3-gcpm and there are also proprietary extensions in some programs.

A variant of such a link occurs when a text is split over non-consecutive pages, e.g., an article starts on the front page of a newspaper and continues on page 4. The author doesn't know where the break occurs, probably doesn't know on what page the article continues, so this link also has to be inserted by the computer.

Maybe you want these references to be fancier (more natural, in a way) and replace “see page 3” if it occurs on page 2 by “see the next page.” Or use words: “page three” or even ordinals: “see the third bullet”…
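The basic generated cross-reference can be sketched with the 'target-counter()' function from css3-gcpm. The class names and the counter name "section" are hypothetical, and the 'attr(href url)' notation follows the draft:

```css
/* After each link of class "pageref", insert the page number
   of the element the link points to. */
a.pageref::after {
  content: " (see page " target-counter(attr(href url), page) ")";
}

/* The same for a section number, assuming the style sheet maintains
   a counter called "section". */
a.secref::after {
  content: " (section " target-counter(attr(href url), section) ")";
}
```

The fancier variants mentioned above, such as “see the next page,” would need a comparison between the current page and the target page, which this function does not offer.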

Continued on…

When a page break occurs, especially if the rest of the text is not on the next page, but in some box elsewhere, it is useful to automatically insert some text, such as “continued on page 3.”

There is a proposed 'text-overflow' property in css3-ui that can insert a fixed text when a box overflows, but it is not clear if that applies to page breaks and it cannot currently insert a page number either.

Page floats

CSS provides for floating content, which is content that is not shown inline, but somewhere off to the side, but still near to the text it relates to. In paged media, it is usual to float not to the side, but to the top or bottom of the page. In some cases you don't care if it is on the same page or the next one, and sometimes you do.

In case content floats to the side, you may want to float it to the outside edge, away from the spine of the book. And if you have columns, you may want the float either to go to the edge of the page or to the edge of the current column. Because the choice for left or right is made by the computer, the typographer cannot know whether he should set a left or a right margin on the floating content. Maybe properties 'margin-inside' and 'margin-outside' are needed, whose values are added to the left and right margins as appropriate.

For this also, there are ideas in css3-gcpm and there are proprietary extensions in programs such as Prince. (In fact, these ideas were already described in internal memos in 1996, but books weren't an important target for CSS back then.)

If you refer to a floating illustration, you might want to refer to “the figure below” or “the figure on the next page” depending on where the figure ends up.
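As an illustration only: drafts of css3-gcpm and some formatters have used additional keywords on the 'float' property for page floats. The keywords below should be read as assumptions about such draft syntax, not as standard CSS, and the class names are hypothetical.

```css
/* Float a wide figure to the top of the page (or column) it falls on. */
figure.wide { float: top; }

/* Float a side note to the outside edge, away from the spine:
   left on left pages, right on right pages. */
aside.note {
  float: outside;
  width: 30%;
}
```

Note that the sketch sets no margin on the side note: as explained above, the designer cannot know whether a left or a right margin is the one facing the text.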

Shapes

Floats, especially if they are images, need not be rectangular and you may want text to wrap around them tightly. Also, if you make a page template, even if it is based on a grid, you may want some of the regions not to be rectangular. A pair of drafts, css3-exclusion (CSS Exclusions Module Level 3) and css-shapes (CSS Shapes Module Level 3) contain proposals for this.
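A minimal sketch of a non-rectangular float, using property names from the css-shapes draft. The syntax of the shape functions has changed between drafts, so the exact notation here is an assumption; the class name is hypothetical.

```css
/* Let text wrap along a circle around a floated image
   instead of along its rectangular box. */
img.portrait {
  float: left;
  shape-outside: circle(50%);
  shape-margin: 0.5em;   /* keep the text a little away from the shape */
}
```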

Aligning to a grid

I already mentioned page templates, but css3-layout also defines element-based templates, because you may want the content of one element (typically a large element, such as a DIV, but not necessarily) to be laid out in a somewhat tabular fashion. Absolute positioning doesn't allow easy alignment and tables can only lay out contents in a fixed order, but templates have neither restriction.

Element-based templates aren't just useful in paged media, of course. But in paged media you often have to work with fixed heights, which means in turn that you can align things to the middle of a column or to the bottom. E.g., you may want four news articles in four columns side by side but aligned at the bottom rather than the top. This scan from a magazine shows an example. Note that the text is aligned at the top and at the bottom, by stretching the space between the last paragraph and the one before it:

Scan from a magazine:

This magazine page has four columns, each with one article consisting of: a photo (the four photos have different sizes but are aligned at their bottoms), a heading (of different sizes also, aligned at their tops), a first paragraph, a variable amount of space and a second paragraph that is aligned to the bottom of the column.

css3-layout contains proposals for grid templates and for alignment of content inside the regions of that grid, but only to align all the content of a region to one side (the same model and syntax as for table cells). However, in css3-box (only in the editors' draft at the moment: CSS basic box model) there are some ideas for stretching margins as in the scan above. The method is similar to what is used in css3-flexbox (CSS Flexible Box Layout Module), but the syntax differs, because the syntax of css3-flexbox cannot be used in normal, flowing text.

Aside: CSS's modules for alignment of elements in GUIs, in particular css3-flexbox and css3-grid-layout, at first sight look as if they could be used for documents as well. css3-grid-layout and css3-layout indeed use some of the same properties. But css3-flexbox and css3-grid-layout only allow aligning single elements, not flows of multiple elements. In other words, they require the mark-up to be modified based on the desired layout. (Which is OK for GUIs, because there the mark-up is part of the style, sometimes called the “skin.”) Also, they have no concept of chained regions, which is necessary to allow content that starts in one region and continues in another.

Page spreads

Sometimes some table in a book is so wide that it needs two pages, or a headline in a newspaper is so important that it needs to span from the left edge of the left page to the right edge of the right page.

It is not as simple as formatting the content on a page of double the size, because there is the spine of the book and you cannot print too close to it. You also don't want half a word on the left page and the other half on the right page.

I know of no proposals for specifying page spreads in CSS.

Tables of contents

The table of contents can usually be generated in a separate stage, after the author finished writing the text and before the style is applied. Only the page numbers have to be filled in during the formatting (which may in theory require multiple passes). But in case you ship an electronic version of the book, rather than paper, you may still want the ToC to be generated at the “client side” so as to ship as small and clean a source document as possible. You could use XSLT or JavaScript, but it could also be added to CSS (although there are no proposals for actual syntax, as far as I know).

Hyphenation, line breaks, micro-adjustments

Good-looking line breaks are important even in scrolling displays, but when reading from paper the user cannot make the text a little wider or narrower, so it is more important that the line breaks are right. Finding the optimal balance between hyphenation, looser or tighter setting, and difference in the amount of space in neighboring lines can be difficult in some cases. If the display is also interactive, the designer might want to give hints about how much time the computer may spend searching for a solution.

Some typographers will go as far as modifying the letters slightly: making the letters a fraction of a point narrower is invisible to the human eye, but may be enough to fit one more letter on the line and avoid an ugly hyphen. Or squeeze or stretch the line height a tiny bit such that the page can fit one more line or one line less, and avoid a nasty page break. (Line height adjustment is sometimes referred to as “feathering” or “text feathering.”)
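For comparison, the controls that a style sheet can express for line and page breaks are coarse. A sketch, assuming the 'hyphens' property from the css3-text draft and the 'orphans' and 'widows' properties of CSS level 2:

```css
p {
  text-align: justify;
  hyphens: auto;   /* let the formatter hyphenate; needs a known content language */
  orphans: 2;      /* at least two lines of a paragraph at the bottom of a page */
  widows: 2;       /* at least two lines of a paragraph at the top of a page */
}
```

Nothing here expresses the trade-off between hyphenation and looser or tighter setting, the micro-adjustment of letter widths, or feathering; those remain up to the formatter.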

Leaders and tabs

Leaders and tabulation are also not exclusive to paged media. Tabulation is subtly different from tables, in that content is aligned to a “tab stop” but isn't contained in a cell and can continue on the same line past the next tab stop and even wrap to the next line. Here is an example, approximated in ASCII:

Hotel . . . . . . . . . . . . 375.55
Travel  . . . . . . . . . .  1460.10
Miscellaneous, including presents
and tips  . . . . . . . . . .  84
Total . . . . . . . . . . .  1918.65
                            Ph. Fogg

CSS can almost do this example, with the leaders in css3-gcpm (which are officially still a draft, but seem quite stable). The “almost” refers to the fact that the numbers in this example aren't aligned at the start or the end, but at the decimal point, a feature the current leaders do not provide.
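A sketch of how the example might be approximated with the 'leader()' function from css3-gcpm. The mark-up is an assumption: each line is taken to be a list item with the amount in a hypothetical "data-amount" attribute.

```css
ul.expenses { list-style: none; padding: 0; }

/* After the text of each item, fill the rest of the line with
   a dot-and-space pattern and then print the amount. */
ul.expenses li::after {
  content: leader(". ") attr(data-amount);
}
```

The amounts end up flush against the right edge, so “84” would align with the last digit of “375.55” rather than with its decimal point, which is the gap described above.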

A little more complex still is a tabular rendering with more than one tab stop:

Coffee          USD    2.00
Tea             USD    1.75
Train           EUR   67.50
Hotel (including Berlin
and Paris)      EUR  450.00

An old proposal handled this, but it was abandoned (in 2005) in favor of the easier, but less powerful, model currently in css3-gcpm.

Bookmarks and other metadata

In an interactive display, there are usually things outside the “viewport” (as CSS calls it) that somehow depend on the document shown inside. The title is usually visible somewhere (in a menu, as a window title, in a list of bookmarks or history, etc.), but other things possibly also. E.g., PDF viewers have a bookmark menu with useful entry points into the document (section headings, important topics), which are based on information provided by the author as part of the document. One proposal (in css3-ui) also provides an icon for the document, to be used, e.g., in search results or in bookmarks.
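As an illustration of author-provided bookmarks: css3-gcpm drafts contain properties along the following lines. The names are taken from that draft and may change; mapping h1 and h2 to two levels is an assumption about the document.

```css
/* Offer chapter and section headings as entry points in the
   viewer's bookmark menu, nested by level. */
h1 { bookmark-level: 1; bookmark-label: content(); }
h2 { bookmark-level: 2; bookmark-label: content(); }
```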

User interaction

If the document is displayed in an interactive viewer, certain actions, such as following a hyperlink or turning a page, could be styled as animations by the designer, in order to show that there are different kinds of links or different kinds of pages.

E.g., in a complex document, the designer might want to use a spatial metaphor to distinguish different kinds of navigation: the next section is “behind” the current one, the next page is “to the right” of the current one. The style sheet need only contain some hints and if the viewer supports the corresponding animations, it will do its best to show the relations to the reader. Some ideas are in css3-gcpm.

Line numbers

Poems and computer code are often printed with line numbers in the margin, for easier reference. The typical method for poems is to number every fifth line, while computer programs are usually numbered at every line.

Other parameters are whether to count empty lines and whether to start counting with 1 on every page.

Copyfitting

Figure: Text made to fit a given width causes each line, after line breaking, to increase its font-size so as to fill the width
Figure: Text made to fit a rectangle causes the font-size to be as big as possible such that all the text just fits the rectangle

There are different kinds of copyfitting. The typographer might want to estimate the number of pages a book will have and choose a different font if the book would become too thick or too thin.

But maybe more interesting for CSS is copyfitting on a smaller scale: choose a font size so that a given text fills a given space, or select a different set of style rules altogether if that makes the text fit better, or the other way round: select the text among a set of variants provided by the author that best fits the available space or that avoids an ugly line break.

Typical places where you might want to vary the font size are headings. Some newspapers especially try to make headings that are exactly as wide as the column. Probably there should be a minimum font size as well, because if the text is very long, it is better to wrap it than to ask the reader to use a magnifier.

If the space is only constrained in the width, the font size is chosen such that the text fits on one line, unless that makes the font smaller than the minimum size. The second and subsequent lines may each get different font sizes.

If the space is constrained in height as well, font size is the same on all lines and the last line need not be full. In this case the line height may optionally be made flexible, so that the last line aligns to the bottom.

Miscellaneous 2

A couple more requirements without going into details:

Styling blank pages

It is common in books to start a chapter on an odd-numbered page (i.e., a right page in English, a left page in vertical Japanese). That means there may be an empty page when the previous chapter ends on an odd page. You may want to do something with such pages: suppress the page number, add the text “this page intentionally left blank” or some such. The css3-page module has a page selector for such pages.
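A minimal sketch with the ':blank' page selector from css3-page, assuming that chapters start with an h1 and that the page number normally sits in the top margin:

```css
/* Start every chapter on a right-hand page; this may leave
   an empty left-hand page before it. */
h1 { page-break-before: right; }

/* On such empty pages, suppress the running header and page number
   and print a notice instead. */
@page :blank {
  @top-left      { content: none; }
  @top-right     { content: none; }
  @bottom-center { content: "This page intentionally left blank"; }
}
```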

Page size

When viewing documents on a screen, it is normally the user and the device that together determine the page size, but when printing a book, the size of the pages is normally specified in the style sheet. css3-page has a property for that. (See also the section on printing marks below.)
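For example, with the 'size' property of css3-page (the dimensions are arbitrary values chosen for the illustration):

```css
/* A book page of 148 mm by 210 mm (A5). */
@page {
  size: 148mm 210mm;
  margin: 20mm 15mm;
}

/* A wider inner margin, next to the spine. */
@page :left  { margin-right: 22mm; }
@page :right { margin-left: 22mm; }
```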

Printing marks

When printing pages, they are often printed on larger sheets of paper and then cut. CSS level 2 already defined crop marks and cross marks.

The style sheet may also be the right place to specify how far outside the edge of the page content should be printed (page bleed), to compensate for slight inaccuracy when the pages are cut. This is proposed in css3-gcpm.
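A sketch combining the 'marks' property of CSS level 2 with a bleed property as proposed in the drafts; the name 'bleed' and the 3 mm value are assumptions for the illustration:

```css
@page {
  size: 148mm 210mm;   /* the trimmed page size */
  marks: crop cross;   /* print crop marks and registration (cross) marks */
  bleed: 3mm;          /* let content extend 3mm beyond the trim edge */
}
```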

Change bars

There are no proposals yet for how to specify that a text should have a change bar in the margin.

User annotations

Scribbling notes in a paper book does not involve CSS, but annotating a book in an interactive reader may involve some CSS to select and style those annotations.

Drop caps

CSS has had drop caps and large initials since level 2, but there is very little control over how they look. A drop cap should normally be aligned at the bottom with the baseline of some line. E.g., a large drop cap might sit on the baseline of the 6th line of the paragraph. In CSS level 2, this is a question of trial and error for the designer. There are some proposals for how to make the alignment explicit.
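The trial and error looks roughly like this in CSS level 2. The numbers are guesses that have to be tuned by eye for each font, which is exactly the problem; the class name is hypothetical.

```css
/* A drop cap intended to span about three lines of text. */
p.opening:first-letter {
  float: left;
  font-size: 3.6em;      /* adjusted by hand until the cap looks right */
  line-height: 0.8;      /* likewise: nothing ties this to a text baseline */
  margin-right: 0.05em;
}
```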

Ruby

There is a draft module (css3-ruby) for typesetting ruby annotations with CSS, but it is progressing slowly. The work done until 2003 was found to lack some necessary features.

(Another presentation in this workshop, by Yasuki Ikeuchi of ACCESS CO., LTD., will talk about requirements on ruby.)

Speech

The user may want to listen to (part of) an e-book instead of reading it on the screen. There is a CSS module for speech synthesis: css3-speech. The module currently has Candidate Recommendation status.

It does not deal with specifying pronunciation. If words shouldn't be pronounced using the built-in rules of the speech synthesizer, the pronunciation has to be specified in the document, but there is no standard for that yet.
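A short sketch of what css3-speech does cover, using properties from the Candidate Recommendation; the selectors are assumptions about the document's mark-up:

```css
/* Read headings more slowly, in a different voice,
   with a pause before and after. */
h1, h2 {
  voice-family: female;
  voice-rate: slow;
  pause-before: strong;
  pause-after: medium;
}

/* Do not read out page furniture at all. */
.pagenumber { speak: none; }
```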

Vertical text: history

Vertical text

css3-writing-modes

Other modules

Some languages are always written vertically, others, such as Japanese, can be written horizontally or vertically, but are more often written vertically in paged media.

The css3-writing-modes module (CSS Writing Modes Module Level 3) proposes properties for switching between vertical and horizontal and for the text effects that only occur in vertical, such as rotated letters and combining narrow horizontal letters into a single letter-like box (“tate-chu-yoko”).

But other modules are affected, too. Vertical text changes the interpretation of some properties, e.g.: 'line-height' is interpreted effectively as a line width; 'text-align' acquires new 'top' and 'bottom' keywords; 'direction: rtl' for Hebrew or Arabic inside vertical text is interpreted to mean bottom-to-top. Others are unchanged, e.g.: 'margin-left' is still on the left, the '@top-left' box for running headers is still on the top left, '@page :left' still selects the left-hand page.
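A minimal sketch with css3-writing-modes. The property for tate-chu-yoko has been renamed between drafts; 'text-combine-upright' is assumed here, and the class name is hypothetical.

```css
/* Vertical lines, stacked from right to left. */
body { writing-mode: vertical-rl; }

/* Set a short run of digits, e.g. "12", upright within one
   character cell of the vertical line (tate-chu-yoko). */
span.tcy { text-combine-upright: all; }

/* Unchanged by the writing mode: this is still the physical left margin. */
blockquote { margin-left: 2em; }
```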

Vertical text, such as for Japanese, has been worked on in CSS since early 1999, i.e., during the development of CSS level 2. That makes it one of the oldest unsolved problems, after page templates. A different model was tried in 2001-2003 and even reached Candidate Recommendation status. Microsoft implemented it. But, like the first attempt, it was found to be insufficient. Since December 2010 the WG has been developing its third model. Hopefully this will be the right one. (In the meantime, the Japanese Layout Task Force has published its report, which is of great help, at least for Japanese.)

Alphabetic index

Some styles only possible during formatting:

Requirements still unclear

No proposals yet

At first sight, an alphabetic index can be generated before the formatting phase, just like the table of contents, based on the mark-up of indexable terms in the document. Only the page numbers need to be filled in later. But the resulting index may not be quite what typographers want. E.g., if a term occurs twice on the same page, the typographer wants to have the option of only listing that page number once; or if a term occurs on three successive pages (“4, 5, 6”), he might want to indicate that by a range (“4–6”); or if the defining instance and a mere mention occur on the same page, he might want to suppress the latter.

All these improvements can only be done after the page numbers have been established, i.e., by the formatter itself. XSL has properties for this, but nobody has proposed a method for CSS so far.

The alternative (which I suspect many publishers use) is to reserve a certain number of pages for the index, format the rest of the book, and then have somebody make an index by hand, hoping that it will fit in those reserved pages. That works for a paper book, but it doesn't work in an e-book, where the page numbers depend on the viewer.

Mathematics

Ideas for CSS properties (from 1999) never published

MathML for CSS Profile – subset approximated with CSS

Some requirements:

I mentioned above that running headers sometimes contain mathematical formulas, but in fact support for mathematics in CSS is all but nonexistent.

The inclusion of MathML in HTML5 is good news for publishing. TeX creates beautiful books, but it doesn't have the semantics of HTML or MathML, and it doesn't work so well for e-books. (It's about time: the first demo of math in HTML I saw already in 1995, in Darmstadt at the third Web conference…)

Until now, you could only combine MathML with XHTML by means of namespaces, but that leads at most to a syntactically valid document, not to a standard format with defined semantics, which can be supported by software.

Unfortunately, despite initial efforts by the CSS WG and the Math WG towards a draft in 1999, the CSS WG never managed to publish a Working Draft for mathematical typography.

The Math WG, in order to help support of HTML5, analyzed the existing CSS and published a sample style sheet, together with the subset of MathML that could at least be rendered in a readable way.

But to do proper math rendering, and to support the rest of MathML, CSS needs new kinds of boxes (values for the 'display' property) for built-up formulas, properties for stretching operators, properties for baseline alignment, properties for line breaking in formulas, etc.

Conclusion

(A Digital Publishing Interest Group may in the future help the CSS WG and other WGs by providing publishing expertise)

In summary, it is not possible to make books or e-books with standard CSS. CSS isn't even up to the level of XSL-FO 1 yet. On the other hand, there are proprietary extensions, in EPUB and in various products, that at least show that CSS can be extended.

Neither those extensions nor any of the other ideas that have been proposed have received a lot of scrutiny. Adding them to CSS is going to take time. Indeed, one may ask if it wouldn't be quicker to make a new style language, not required to be backwards compatible with CSS and from the start designed to handle complex layouts and typography without compromises.

I have collected an initial list of requirements and started looking at how they might be solved within CSS, but that list is far from complete. E.g., there is also the longer list that the XSL WG collected a few years ago for XSL-FO 2 and that hasn't been looked at from the point of view of CSS at all. It is likely that most things on that list will also have to be turned into technology and standardized.

Now is the time to think about how we want to standardize a technology for typesetting books, magazines and e-books. Should it be based on CSS? On XSL? RDF? something else? How many parts should it have: one, as is typical with CSS? Two as for XSLT and XSL-FO? More?

At the same time we should continue collecting requirements and use cases.

There was expertise in the XSL WG and we are at risk of losing it. Hopefully we can create an Interest Group for digital publishing, which can help the CSS WG and other groups in their work.

The end

http://www.w3.org/Talks/2013/0604-CSS-Tokyo

