[Workflow submission] Bert Bos - The limits of "single-source publishing" with XML and CSS
Position paper:
The limits of "single-source publishing" with XML and CSS
By: Bert Bos (W3C)
The promise: multiple style sheets via Media Queries
----------------------------------------------------
The model for single-source publishing with XML (or HTML) and CSS is
that the only thing needed to add another kind of output is one more
style sheet. Each style sheet is labeled with a "Media Query" that
describes the kind of media the style is meant for. The process that
produces a certain output only has to select the style sheet(s) with the
appropriate label and ignore the others. E.g., 'speech' is a Media Query
for a style sheet for a speech synthesizer, 'print and (color)' for a
color printer, and 'screen and (max-width: 40ch)' for a screen that is
40 letters wide or less.
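
As a rough illustration, the three queries named above could label
groups of rules inside a single style sheet as follows. Only the queries
come from the text; the selectors and declarations are invented for the
example.

```css
/* Applied only when rendering to a speech synthesizer */
@media speech {
  h1 { voice-family: female; pause-after: 1s }
}

/* Applied only on color printers */
@media print and (color) {
  a { color: navy; text-decoration: none }
}

/* Applied only on screens that are at most 40 characters wide */
@media screen and (max-width: 40ch) {
  body { margin: 0.5em }
  img { max-width: 100% }
}
```

The same queries can also be attached to a whole style sheet, e.g., on
an '@import' rule or in the 'media' attribute of a link to the style
sheet.
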
In addition, the model says that style sheets are often reusable, i.e.,
there are classes of documents that are similar enough that the same
style sheets can be used for all of them. E.g., all books in a certain
series, or all articles for a certain conference can share style sheets,
with maybe minor overrides for specific cases.
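
A minimal sketch of such reuse, assuming a shared style sheet for a book
series and one book that overrides a few rules. The file names and the
'epigraph' class are hypothetical.

```css
/* Style sheet for one book; the imported files are shared by the series */
@import url("series.css");
@import url("series-print.css") print;

/* Minor overrides for this particular book */
h1 { font-family: "Gill Sans", sans-serif }
.epigraph { font-style: italic }
```
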
The problems
------------
In practice, the model works less well than intended. The reasons fall
into several categories:
1) The mark-up of the document isn't always sufficiently rich. To be
able to distinguish and style different parts of a document, the style
sheet relies on element types, attributes and context (i.e., element
types and attributes of nearby elements). If the author hasn't marked up
different kinds of contents differently, the style sheet cannot style
them differently. This may happen, e.g., when the author hasn't realized
that a certain kind of content needs different treatment on a particular
kind of media. E.g., table mark-up is often not rich enough if the
author hasn't thought about speech output, or an image caption isn't
distinguished from a normal paragraph and thus cannot be restyled when
the image is moved or scaled on a small screen.
2) Media Queries do not offer enough media types and media features, or
those that exist aren't clearly defined. There are only nine media
types, and three of them (screen, handheld and print) cover almost the
whole range of visual media: paper, e-readers, smartphones, tablets,
laptops and desktop screens. Only video projection is separate. The
official specification offers no help in deciding whether an e-reader is
a 'screen', a 'handheld' or a 'print'. And the media features, of which
there are thirteen and which are meant to distinguish subclasses of each
medium, do not help much either. They do distinguish screen sizes, but
not, e.g., whether the display is paginated or scrolling.
3) CSS isn't sufficiently rich to re-order content. The document is,
ideally, marked up in a logical order that makes it the most
understandable if it is rendered with no style sheets at all (apart from
the default style of the renderer, such as the default HTML style when
the document is HTML or XHTML). But on a large screen, the layout
is two-dimensional, and on a small screen the layout is, although
linear, not necessarily in the original order. Proposals for layout
"templates," which would allow reordering, were made in 1996 and updated
in 2005, but have so far not passed the stage of prototypes. Combining
CSS with XSLT, while theoretically a solution, is rarely done in
practice, for various reasons. E.g., EPUB3 does not include XSLT.
4) Implementations do not follow the specifications. Sometimes
deviations from the specifications are simple bugs and are temporary
problems. But sometimes they cannot be fixed so easily. This is the
case, e.g., with screen readers. The model of single-source publishing
mentioned above is that a screen shows the result of applying one style
sheet and the speech synthesizer speaks the result of applying another.
In practice, screen readers try to interpret the screen instead, because
the speech style is too often of very low quality and the screen style
often adds important information that is not found in the document
itself. But as long as screen readers work like that, designers cannot
use the screen and speech styles as intended. Chicken and egg... EPUB3
includes support for speech style sheets and some e-readers offer speech
output. However, initial reports suggest that they aren't as accessible
as they should be.
5) Transclusion is not well handled by CSS. This includes the common
case that a document consists of multiple parts: not just a text and
separate images, but several text parts, e.g., one file per chapter.
EPUB describes a packaging format for such multi-file documents, but CSS
has no support for it. Giving the main file (the one that links to all
the chapters) to CSS can at most result in a rendering where each
chapter is a fixed-size, scrolling box. The SEAMLESS attribute of
IFRAMEs in HTML5 is an example of the same problem, for which CSS has no
solution yet.
Ideas towards solutions
-----------------------
1) Mark-up formats such as XHTML (which is used in EPUB, e.g.) are
probably rich enough to express all the different roles a piece of text
can have, at least sufficiently to allow styling. Solutions for
improving mark-up may involve educating authors or giving them automated
tools (validators and lint-like tools).
There are also cases where the mark-up is rich enough in principle, but
CSS selectors aren't yet powerful enough to make use of it. A typical
case is that the distinguishing factor between two elements is something
that is *inside* the elements: E.g., a section should be styled
differently when it starts with a heading than when it contains no
heading. There are proposals already being discussed in the CSS WG that
should solve this particular case and several others.
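
One selector that addresses this case is the relational pseudo-class
':has()' of Selectors Level 4. The sketch below assumes HTML mark-up
with 'section' elements and a renderer that supports ':has()'; the
declarations are invented. Note that "starts with a heading" is
approximated as "the first child element is a heading."

```css
/* A section whose first child element is a heading */
section:has(> h2:first-child) { border-top: thin solid }

/* A section that contains no heading at all */
section:not(:has(h1, h2, h3, h4, h5, h6)) { margin-left: 2em }
```
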
2) The list of Media Types in the Media Queries dates from 1997 and 1998
and is based on predictions about devices that didn't exist at the time.
The Media Features, which distinguish subclasses of those media, were
defined in 2001 (although they didn't become a standard until 2012). It
is probably time to make a list of current and expected devices and
define how they map to each media type and media feature, and, where
needed, define new types or features.
E.g., e-readers could be a new media type. Or they could be classified
as 'print', but distinguished from paper by a new feature
'(interactive)'. Or they could be 'handheld' with a feature
'(paginated)'. Another example: An electronic billboard (the current
terminology is "digital signage") is like a screen, but it is not
interactive. However, it is dynamic and can support animations, unlike
paper.
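
Written out, the two e-reader classifications might look like the
sketch below. Both '(interactive)' and '(paginated)' are the
hypothetical features suggested above, not defined media features, and
the rules inside are invented.

```css
/* HYPOTHETICAL features: not part of any Media Queries specification */

/* An e-reader classified as 'print' that, unlike paper, is interactive */
@media print and (interactive) {
  a { text-decoration: underline }   /* links can be followed */
}

/* An e-reader classified as a paginated 'handheld' */
@media handheld and (paginated) {
  h1 { page-break-before: always }
}
```

A renderer that does not know a media feature treats the whole query as
false, so rules like these would simply be ignored until the features
are defined and implemented.
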
3) CSS does not support document transformations, such as those provided
by XSLT. That is to keep CSS easy to understand and use, and to better
support WYSIWYG editing of documents. But even so, it could in theory do
much more than it does now. (It currently has the concept of float,
which allows elements to move left or right in a limited way; a
'caption-side' property, which sometimes allows an element to move above
or below a sibling; and the "flexbox" properties. But all of these can
at most reorder siblings, and they have other limitations and
side-effects, because they were meant to solve very specific problems.)
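
For reference, the three mechanisms look roughly like this. The
selectors are assumptions about the document's mark-up.

```css
/* float: moves the box left or right; following content flows around it */
aside.note { float: right; width: 12em }

/* caption-side: puts a table caption above or below its table */
table { caption-side: bottom }

/* flexbox: 'order' rearranges the children of one flex container */
article { display: flex; flex-direction: column }
article > nav     { order: 2 }
article > section { order: 1 }   /* shown before the nav, whatever the source order */
```

In each case the element stays within its parent: none of these rules
can move the 'nav' out of the 'article' to a different part of the page.
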
The idea of a "layout template" is a very old one and one that appears
easy to understand and easy to use in CSS. A template defines one or
more "regions" that are laid out on a grid or other suitable layout
framework. The regions can optionally be combined (a.k.a. "chained," as
a way to make disjoint regions). The elements of the document are then
each assigned to a region or chain of regions.
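
No template syntax is given here. As an approximation only, the sketch
below uses 'grid-template-areas' from CSS Grid Layout, which is a
different mechanism from the template proposals discussed in this paper:
it names regions on a grid and assigns elements to them, but it can only
place the children of the grid container and it cannot chain regions.
The element names are assumed.

```css
/* The template: three named regions on a two-column grid */
body {
  display: grid;
  grid-template-columns: 12em 1fr;
  grid-template-areas:
    "head head"
    "nav  main";
}

/* Assign elements to regions, independent of their order in the source */
body > header { grid-area: head }
body > main   { grid-area: main }
body > nav    { grid-area: nav }

/* On a narrow screen: same document, linear template, different order */
@media screen and (max-width: 40ch) {
  body {
    grid-template-columns: 1fr;
    grid-template-areas: "head" "main" "nav";
  }
}
```
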
CSS in fact already has a kind of layout template, in the form of a
predefined page template. This is a simple template designed to handle
the most common kinds of running headers and footers in paginated
output.
(It is probably possible to use layout templates also as page templates,
when the predefined page template is not enough. The behavior of running
headers is, in fact, not a function of the template, but is defined by
the 'content' property of CSS.)
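
A sketch of the predefined page template with a running header, using
margin boxes from CSS Paged Media and named strings from the Generated
Content for Paged Media drafts. Support is mostly found in
print-oriented formatters, and the 'string-set' syntax has varied
between drafts, so treat the details as assumptions.

```css
@page {
  size: A5;
  margin: 2cm;
  /* Margin boxes of the predefined page template */
  @top-center    { content: string(chapter) }   /* running header */
  @bottom-center { content: counter(page) }     /* page number */
}

/* The header text is captured from the document, not from the template */
h1 { string-set: chapter content() }
```
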
Nevertheless, it is probably not possible to rely on templates for all
layout. Sometimes document transformations are still necessary. E.g.,
there are no template-based proposals for re-ordering speech yet. And to
style documents such as RDF, which has no defined order, some
transformation to a known order is needed. In the case of RDF, that
might be done with SPARQL.
4) The problem that some bugs cannot be fixed, because fixing them would
make the user experience worse as often as it would make it better, is
probably not easily solved with technical measures. Maybe if speech
synthesis improves enough that people will *want* to listen to their
book instead of reading it, or if browsers and reading devices offer an
easy switch so that the user can try which of the document's different
style sheets gives the best result, then authors and designers will make
more of an effort to use the features that make each medium interesting.
There is a lack of knowledge, at least a lack of widely shared
knowledge, and probably a lack of research, on how to make e-readers
accessible. E.g., is it interesting for a listener to know that the
visual representation is paginated, and if so, what parts of the
generated text (running headers, page numbers, references to page
numbers, etc.) should be spoken and which should be omitted?
5) The notion of "intrinsic size" in CSS is so far only defined for
external resources with a fixed aspect ratio. CSS can handle
transclusions where, when the style sheet sets the width, the height is
always the width times a certain constant. But the intrinsic size of a
text document works differently: when you increase the width, the height
typically *decreases,* and not in a linear way. Maybe all that is needed
for an author is a simple keyword to indicate that the height depends:
'height: min-content'. But there may be security/privacy implications
that make the definition tricky. If the device supports JavaScript, a
script may derive information from the resulting size of a transcluded
resource (e.g., that it failed to load) and send that information
somewhere, even if it cannot see what is inside the transclusion.
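
The difference for the author could be as small as the sketch below,
assuming chapters are transcluded with IFRAME elements that carry a
hypothetical class 'chapter'. The 'min-content' keyword exists in CSS
sizing, but using it to size a frame from the document inside it is the
behavior proposed here, not something that should be assumed to work.

```css
/* Today: the transcluded chapter gets a fixed-size, scrolling box */
iframe.chapter { width: 100%; height: 30em }

/* PROPOSED, not defined behavior: the height follows from laying out
   the transcluded document at the given width */
iframe.chapter { width: 100%; height: min-content }
```
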
Giving a transclusion its proper height is enough to make it visually
part of the document. It doesn't become part of the DOM and it keeps its
own style sheet, which also means that its margins do not collapse with
the element it is transcluded in. But that is probably not more than a
minor inconvenience.
The situation is different for speech output. The transcluded document
should be read out loud, reading should continue in the outer document
when the transcluded one ends, and navigation should be possible almost
as if the transclusion were an actual part of the outer document: E.g.,
skipping to the next heading should skip into the transclusion if the
next heading is there and out of it if there is a next heading in the
outer document.
Some research is needed. Maybe in this case a new keyword on the 'speak'
property is useful. Or maybe not: the 'content' property can already
express that an external resource is transcluded. Maybe all that is
needed is to define how speech works for transcluded content that
contains text.
Conclusion
----------
The idea that a single document overlaid with a single mark-up is very
often enough for publication on widely different media has proved to be
correct to some extent. But style sheet technology hasn't kept up with
the appearance of new media (or new variations on old media). The gaps
in CSS also weigh more heavily because transformation languages such as
XSLT, which could in theory fill them, are not used as much as expected.
Research is needed, and in part already underway, to draw up new
requirements, in particular to enhance Media Queries and to improve,
replace or complement CSS.
Bert
--
Bert Bos ( W 3 C ) http://www.w3.org/
http://www.w3.org/people/bos W3C/ERCIM
bert@w3.org 2004 Rt des Lucioles / BP 93
+33 (0)4 92 38 76 92 06902 Sophia Antipolis Cedex, France