CSS - an overview

CSS gives (fairly precise) hints about the last steps, shown emphasized. XSL can also specify the last-but-one step, from machine-readable representation to structured text/graphic.

But neither of them aims to be complete: they can do common presentations, such as typical books and simple magazines, and even some more complex on-screen layouts, but there are many effective visualizations possible that cannot be expressed with them. Think of the visual effects in some advertisements, or the animations in TV commercials.

And not only are they limited in what they can produce, they are also very limited in what they accept: only XML (plus HTML, in the case of CSS). No support for other structured formats, such as those of databases, spreadsheets, CAD programs, calendars, etc. We assume that those formats either come with their own style languages, or are transformed into XML first.

They are limited on purpose. There would be no advantage if they were yet another programming language. The Web can exist because it is modular: different pieces of technology work together, but each of them is of manageable size. Paradoxically, the more people contribute to the Web, the more necessary it becomes to stay away from monolithic systems. The coordination would just be impossible.

A tiny part of communication...

A style sheet specifies a view onto some information. That information is already in a form very close to how a human would understand it (it is already text, for the most part), but there can still be a lot done to it to adapt it to the user, the environment, the occasion...

Communication: a wider view

Communication starts with an idea, which must be represented in a computer language. Some cycles of reflection may be necessary before the representation captures the idea correctly. But the idea has now entered the Web, through the first of its two human-computer interfaces.

The network (the Web) transports it, but may also try to enhance or otherwise manipulate it, by automatically combining it with related ideas, by indexing it and linking it to and from other ideas. The Web may also introduce a delay, if the communication is asynchronous. The Web is thus also an archive for ideas.

Eventually, the idea, or what results from it, will be displayed to another human, and that's where the second human-computer interface comes into play: the one that we saw above.

At this point, the second user can look at the idea from several angles, and form an opinion about it. He will interpret the representation, and may get a new idea, that he wants to communicate to somebody else.

Different views?

The reality on the Web is that the author can try to control what the reader sees or hears, but that he will not succeed. All computers are different, and even if you use images or PDF, some people will not see what you expect them to see. Some people think that is a technological shortcoming, from the viewpoint that the author is the ultimate authority on how his ideas are best presented. Others think it is an advantage over other media, because only the reader knows how he best understands what is presented.

In fact, trying too hard to control can make documents inaccessible. It is better to leave the reader the possibility to influence the presentation. CSS was designed to give the reader this possibility.

That's why various accessibility and usability guides on the Web recommend using CSS.

Style sheets not only take away the author's power, they also increase it. Style sheets can do more than proprietary extensions of HTML can do. And, most of all, they are very easy to use and re-use.

It's hardly necessary to repeat the above in this forum. People coming to an XML conference have probably heard about the separation of structure and style already. But still it is useful to repeat it here, since it explains much of why CSS is the way it is. And when using style sheets (or developing a style language), it is easy to get carried away and believe that you are designing what the reader will see. Especially when using a WYSIWYG editor, which is perfectly possible with CSS, you may believe that what you get is what everybody else gets.

The elements of style

A style is applied to a document. A renderer (visual, aural, etc.) processes a document, guided by the rules of the style sheet, and produces images on the screen, sounds from a speaker, etc.

The result depends on the document and on the rules, obviously, but also on the environment: the size of the screen, the available fonts, the number of colors, etc.

The styling process - example rules

Most of the rules in a CSS style sheet look like this: one or more selectors and one or more declarations.

The selectors are usually quite simple: a single element, or an element with its CLASS attribute (see below), but the context can get very precise if needed.

The declarations are always of the form keyword colon value, where the value can contain numbers, strings, and some other things, depending on the property. Declarations can have a flag to mark them as "important" (explained below).
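The slide with the example rules is not reproduced in this text. A minimal sketch of what such rules might look like, with property values chosen purely for illustration:

```css
/* Selector "H1 EM": EM elements that occur inside an H1 */
H1 EM { color: red }

/* Selector "CODE": all CODE elements. The second declaration
   carries the "important" flag. */
CODE {
  font-family: monospace;
  color: green !important
}
```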

The rules above match EM elements if they are inside an H1, and all CODE elements.

The styling process - algorithm

The algorithm matches the way a document is read when it is coded as a text stream. The order of processing is determined by the document, not the style sheet, which means the style sheet is largely order-independent. It isn't completely order-independent, because the last step in the cascading rules stipulates that, if two rules aren't otherwise distinguished, the last one overrides the first. But this characteristic makes the style sheet that much more declarative and easier to combine with other style sheets.
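The "last one wins" tie-breaker can be illustrated with a small sketch (the colors and sizes are arbitrary assumptions):

```css
/* Two rules with the same selector, from the same style sheet:
   nothing else distinguishes them, so the later one wins and
   EM elements are blue. */
EM { color: red }
EM { color: blue }

/* These two rules set different properties, so swapping them
   changes nothing: order only matters when declarations conflict. */
H1 { font-size: 200% }
H1 { text-align: center }
```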

But it means that, although you can stream the document, you cannot stream the style sheet. To apply the style, you need to have parsed the whole style sheet. However, style sheets, especially in CSS, are usually very short, and they are often shared between related documents.

On the W3C site, for example, there is only a handful of categories of documents (working draft, recommendation, press release, activity page, etc.) and within a category, all documents share a style sheet (apart from exceptions).

Selectors

Selectors identify the set of elements to which a certain style rule applies. An element can be selected based on its name, any attributes it has, or on its position inside the tree. But the position can only be based on any elements that have started earlier in the document stream, since CSS has to work with browsers that support progressive rendering.

The most common selectors are the easiest to write:

simple element names: EM, CODE
elements with their class: EM.foreign, H2.subtitle
elements inside another element: H1 EM, OL LI
elements inside a parent: BODY > H1
elements immediately after another element: P + P, H2 + P
combinations of the above

Pseudo-classes denote dynamic states that an element can be in: cursor is on it, element has been activated, link has been followed, element has keyboard focus, etc.

The :first-child pseudo-class is an exception: it selects the first child element in a parent element.

Pseudo-elements select parts of elements that are not distinguished in the source document: the first line of a paragraph, the first letter of a paragraph, the text that is generated in front of an element or after it.

Examples of selectors
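The original examples are not reproduced in this text. As a sketch, the selector types listed above could be written like this in CSS2 (the declarations and the class names other than those mentioned in the text are arbitrary assumptions):

```css
CODE             { font-family: monospace }    /* element name */
EM.foreign       { font-style: italic }        /* element with a class */
OL LI            { list-style: decimal }       /* LI anywhere inside an OL */
BODY > H1        { text-align: center }        /* H1 that is a child of BODY */
H2 + P           { text-indent: 0 }            /* P immediately after an H2 */
DIV.note > P + P { text-indent: 1em }          /* a combination */

A:visited        { color: purple }             /* link has been followed */
A:hover          { background: yellow }        /* cursor is on the element */
LI:first-child   { font-weight: bold }         /* first child of its parent */

P:first-line     { font-variant: small-caps }  /* pseudo-element */
P:first-letter   { font-size: 200% }           /* pseudo-element */
BLOCKQUOTE:before { content: open-quote }      /* text generated in front */
```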

A matter of class

The class attribute is a simple form of object-orientation for HTML elements. P.intro gets all the style rules that are defined for P, unless overridden by a rule for P.intro. It allows extra information to be encoded in an HTML file, such as the name of the database fields from which an element was generated.

Multiple classes can be assigned to a single element, to express, e.g., that a certain line in a play is at the same time a line spoken by Mary and a line spoken in whispering.
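A small sketch of both points, assuming markup such as <P CLASS="intro"> and <P CLASS="mary whisper"> (the class names mary and whisper and all property values are made up for illustration):

```css
/* Applies to every P, including P elements with CLASS="intro" */
P { font-size: 1em; text-indent: 1.5em }

/* Overrides only text-indent for P.intro; font-size still comes
   from the rule for P */
P.intro { text-indent: 0 }

/* An element with CLASS="mary whisper" matches all three rules */
.mary         { color: maroon }
.whisper      { font-size: 80% }
.mary.whisper { font-style: italic }
```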

A group of related documents can share a common vocabulary. The W3C Core Style Sheets, e.g., have classes like warning, offsite, subhead, and mtb (medium thematic break). When you write a document you can import the common styles and then only add the rules that are specific to this document.

But XML has no such mechanism built-in. Should we create such a common, cross-format mechanism?

A class action for XML

XML has no standard subclassing mechanism. What do we do? The answer determines how the convenient dot-notation of CSS will be extended to XML.

Let each developer of a new XML-based document add a CLASS-like attribute. CSS then needs a way to specify which attribute is the class attribute.
Reserve the dot-notation for attributes literally named "class". This constrains the format developer somewhat, but it is at least easy to understand. XUL (Mozilla's interface description language) uses class, and MSIE 5 accepts class on any XML file (though that is a risky feature to rely on, of course).
Make a very small namespace. Then the attribute will be called "c:class" or something, which isn't too bad, but authors will then also have to add the namespace declaration to their documents. And it requires namespace support in XML parsers.
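For comparison: CSS2 defines the dot notation in HTML as a shorthand for an attribute selector on class, and the attribute selector itself works with any attribute name. A sketch, in which the element name line and the attribute name kind are hypothetical:

```css
/* In HTML these two selectors match the same elements */
P.warning           { color: red }
P[class~="warning"] { color: red }

/* In a generic XML format, a class-like attribute with a different
   name can already be matched with the longer attribute selector */
line[kind~="whisper"] { font-style: italic }
```

The open question in the options above is only which attribute, if any, the shorter dot notation should stand for.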

Properties

There are 122 properties in CSS2 (= 20 aural + 98 visual + 4 common), for (nearly) everything that has been done with proprietary HTML extensions, and much more. The properties are usually easy to understand for anybody with some knowledge of DTP, although non-English speakers may of course have to look up the translation.

There are also properties that apply not to elements, but to the page on which a document is printed. In CSS2 there are only a handful of those, but in CSS3 the possibilities to influence the look of a page outside the document itself will be extended.
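As an illustration of the CSS2 page properties, a minimal sketch (all lengths are arbitrary):

```css
/* Margins for every page */
@page { margin: 2.5cm }

/* Mirror the inner and outer margins on left and right pages */
@page :left  { margin-left: 4cm; margin-right: 3cm }
@page :right { margin-left: 3cm; margin-right: 4cm }

/* Extra space at the top of the first page */
@page :first { margin-top: 8cm }

/* An element property that interacts with pages:
   start every H1 on a new page */
H1 { page-break-before: always }
```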

Cascading

Cascading is probably what distinguishes CSS most from other style sheet languages. It expresses the fact that on the Web the reader has control. The author and the reader have to work together to get the message across in the best possible way.

But cascading is also very useful for the style sheet designer. The @import mechanism allows style sheets to be split into modules, much like a modern programming language. You can @import multiple complete or partial style sheets and add to them or override them. That helps to keep style sheets very short and manageable.
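A sketch of such a modular style sheet; the file names are hypothetical:

```css
/* @import rules must come before all other rules in the style sheet */
@import url("house-style.css");   /* hypothetical shared base style */
@import url("tables.css");        /* hypothetical module for tables */

/* Local additions and overrides for this document only */
H1    { color: navy }
TABLE { border: thin solid }
```

Rules in the importing style sheet are treated as coming after the imported ones, so on a conflict between otherwise equal rules the local rule wins.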

There are in fact three sources from which style sheets are taken when a document is rendered:

The User Agent (often a browser) has a built-in default. For HTML that default is more or less specified by the HTML spec and the CSS spec makes it concrete (in an appendix).
The author can attach style sheets to a document, via links in the document or via HTTP headers (if HTTP is used to transport the document). In the future there may also be other ways for the author to associate style sheets and documents, via RDF, e.g.
The user can install one or more local style sheets and direct his UA to use them for all or for certain documents.

Not all UAs make it easy for the user to select his style sheets. In MSIE the user can install one user style sheet, but to turn author style sheets on or off you need a third-party add-on.

NS 4 doesn't allow user style sheets and you can only turn author style sheets on/off all at once, not individually.

Amaya has one user style sheet, and allows all style sheets to be turned on/off individually.

Opera 3.5 has one user style sheet and allows author style sheets to be turned on/off together, but not individually. A nice touch is that this operation is a single keypress: Ctrl-G.

But things are improving for this aspect of CSS as for others: Mozilla has individual control over author style sheets. And it looks like the next versions of other browsers will have similar facilities.

Cascading is !important

Although I said that the reader is in control, it is very easy for the reader to leave things to the author. In fact, the reader can provide fallbacks for things the author doesn't specify (level 4 in the list above), as well as overrides, by means of the !important flag.
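The rule that the following paragraphs refer to is not reproduced in this text. Going by their description, an override in a user style sheet could be sketched as:

```css
/* Universal selector: matches every element.
   Both declarations carry the !important flag. */
* {
  color: white !important;
  background: black !important
}
```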

Typically, a reader will have a fallback style sheet, for when the author's one is unreadable: he just switches the author's one off completely in that case. But for accessibility reasons, the reader may also have certain overrides with !important: while keeping most of the author's style, he can thus decide to make all colors black and white, e.g., as is done in the rule above.

The rule uses the universal selector, so that the properties will apply to all elements, and says that the color and the background must be white on black (unless overridden by another rule in the author's style, of course).

PART 2 - current developments

What will be in level 3 of CSS?

CSS vs XSL?

CSS level 3

The 3rd level of CSS will obviously add a couple of new features above those already present in level 2. A few examples:

vertical writing (Japanese, Mongolian)
control over running headers/footers and other page margins
properties for describing user interface elements (forms) in their various states of interaction
properties for vector graphics (SVG)

The properties for SVG allow, for example, a single style sheet to control both the text and the graphics on a page, making it easier to make the graphics blend in with the text and vice versa. They also allow a single drawing to be presented with different views, bringing the benefits of multiple views to the world of images.

CSS3 will also be presented as a set of modules, facilitating the use of the technology in various other contexts, either connected with style or not. More about those below.

The growth of the Web has engendered a change in the make-up of its population: fewer hackers that always download the latest versions of programs and can fix the bugs themselves, more people that buy a browser like a TV: they don't know or care how it works, but they expect it to last for 10 years. That means that bugs and incompatibilities in Web software are much less tolerated now than they used to be.

CSS was the first W3C Recommendation to be accompanied by a test suite (although the CSS1 test suite came out at the same time as the CSS2 spec, which was rather late). The CSS working group has decided to develop the CSS3 spec and the CSS3 test suite together, with at most a month or two between their publications. W3C as a whole has also decided to become active in testing. Eventually there may be a W3C label of conformance, but that is still a long way off, if it ever happens at all.

CSS3 modules sampler

The selectors allow selecting sets of elements from a document tree for other purposes than styling them. E.g., STTS, a transformation language submitted to W3C, uses them to select the elements to transform, as well as to describe what to transform them to. Other proposed formats, including XML schema languages, propose them for yet other purposes. For those applications (which do not have the "progressive rendering" restriction of CSS), there will be an additional level above level 3, with selectors that allow elements to be selected based on what follows them.

The syntax of CSS is very simple to use and to read, and yet quite powerful. It has been designed to be written by hand (even though CSS as a whole has been designed for WYSIWYG editing), and is extensible: the parsing rules specify what has to be ignored and what cannot be ignored, so that new versions of a language can add features without breaking old parsers. It has a type system that includes all the types normally found in programming languages, plus some convenience types: dimensions (numbers with units, such as 12cm), percentages, URLs, ... It is very simple, e.g., to express RDF in CSS syntax. (Although I'm not advocating it, one syntax for RDF should be enough.)
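A short sketch of the value types and of the forward-compatible parsing rules. The image file name and the property future-property are made up for illustration:

```css
/* Values of several types: dimension, percentage, URL, string, color */
DIV.figure {
  width: 12cm;
  margin-left: 10%;
  background: url("paper.png");          /* hypothetical image file */
  font-family: "Gill Sans", sans-serif;
  color: #336699
}

/* A parser that does not know the second property ignores that one
   declaration and still applies the declarations around it */
P {
  color: black;
  future-property: 3 widgets;            /* made-up property */
  margin-top: 1em
}
```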

ACSS is basically unchanged from CSS2, but by writing it as a module the rules for conformance will be easier to apply. In CSS2 already ACSS was implicitly usable on its own without the other parts, and vice versa, but in CSS3 the module will have a name and an explicit list of dependencies on other modules.
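For readers who have not seen the aural properties, a minimal sketch using CSS2 property names (the values and the class name aside are arbitrary assumptions):

```css
/* Headings: a different voice, louder, with pauses around them */
H1 {
  voice-family: male;
  volume: loud;
  pause-before: 1s;
  pause-after: 500ms
}

/* Emphasis is rendered with pitch and stress instead of italics */
EM { pitch: high; stress: 70 }

/* Asides are spoken faster and appear to come from the left */
P.aside { azimuth: left; speech-rate: fast }
```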

The box model is what underlies the visual properties of CSS. It describes the layout of a document as a set of nested and juxtaposed boxes, each with margins, borders, and other characteristics. The "formatting objects" of XSL are a superset of the boxes of CSS, although there are also more abstract objects in XSL, that represent multiple boxes, but are not themselves boxes.
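As a sketch of how the box model appears in a style sheet (the class name sidebar and all lengths are arbitrary):

```css
/* One box: content width, then padding, border and margin around it */
DIV.sidebar {
  width: 15em;
  padding: 0.5em;
  border: thin solid black;
  margin: 1em 2em;
  float: right            /* placed beside the following boxes */
}

/* Paragraph boxes nested inside the sidebar box */
DIV.sidebar P { margin: 0.5em 0 }
```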

The fonts module contains a model for fonts and an algorithm for selecting the right font for each character. It takes into account the fact that fonts may not be available on the reader's machine. A separate module, which depends on and extends the fonts module, is called WebFonts and describes a system for synthesizing and for downloading fonts, thus allowing "embedded" fonts in Web documents (even though strictly speaking, it is typically only the URL of the font that is embedded, and not the whole font file).
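A sketch of both modules at work; the font name and the URL are hypothetical:

```css
/* WebFonts: describe a font and say where it can be downloaded.
   Only the URL is in the style sheet, not the font data. */
@font-face {
  font-family: "Example Serif";              /* hypothetical font name */
  src: url("fonts/example-serif.ttf")        /* hypothetical location */
}

/* Fonts: a list of alternatives, tried in order, ending in a
   generic family that is always available */
BODY { font-family: "Example Serif", Georgia, serif }
```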

Like ACSS, the fonts and WebFonts modules are basically unchanged from CSS2.

SVG is W3C's upcoming specification for vector graphics. It uses some of the existing CSS properties, but also introduces new properties specifically for styling graphic shapes. (In fact, some of the properties originally invented for graphics appear to be useful for text as well, and will thus find a place in some other module. Transparency (or rather 'opacity') is an example.)
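A sketch of one style sheet that covers both text and graphics. The property names fill, stroke, stroke-width and opacity come from SVG; the class name watermark and the values are arbitrary assumptions:

```css
/* Text */
H1 { color: #003366; font-family: sans-serif }

/* Shapes in an SVG drawing, in the same color as the headings */
circle, rect {
  fill: #003366;
  stroke: black;
  stroke-width: 2
}

/* A property that originated with graphics, applied to text */
P.watermark { opacity: 0.3 }
```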

In total, there will be about 20 modules that make up CSS3.

CSS and XSL - the goals

HTML was originally very easy to parse. You could use an SGML parser or something written with lex & yacc, but the addition of proprietary tags made the task much more daunting. Most of those proprietary extensions had to do with formatting, so it made sense to concentrate on CSS in order to save HTML from collapsing under its own weight. CSS has helped, but not enough, and to make development of HTML tools easier again, we are now developing XHTML.

But now there is a need in the market place for more sophisticated manipulations of documents than what CSS was designed for (or what most people need). Rather than continuing to add more and more complex levels of CSS, it made sense to design a language specifically for those more difficult document manipulations and for the audience that would be capable of programming them.

Of course we now have a problem of public perception: two style languages both ratified by W3C, but that can be solved, and in fact, CSS will be the stronger once we can point people with ideas for extensions to XSL.

CSS and XSL - using them together

This diagram is a nice bridge to the next presentation (on XSL). It shows the three ways for styling a document that are enabled by the CSS/XSL duo, or rather the CSS/XSLT/XSL-FO trio. If we forget the case of HTML versions 2-4, for which XSL is not suitable, we have the following three possibilities:

only CSS - if no transformations are necessary beyond the occasional insertion of an image or a canned phrase, then CSS by itself will be enough to style the document. This case is meant for documents like HTML that are either written for human consumption or are generated by other means than XSLT (Perl, PHP, etc.)
XSLT and CSS - the styling can also be done in two steps: first a transformation with XSLT and then the application of a CSS style sheet to the resulting document. This is a variant of case 1, but now we assume that the transformation is done at the client side, by a UA that has signaled to the server that it understands XSLT.
XSLT and XSL-FO - rather than converting to a new document, XSLT can also transform directly to the abstract formatting objects (similar to the CSS boxes, as explained above). There is no separate style sheet that contains the properties, but rather the properties are hardcoded in the transformation. This, unfortunately, makes cascading (and thus the user's control) much harder, but for some jobs it may be easier this way. In particular, it may be necessary for certain styles (property values) that will not be available in CSS.
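The "occasional insertion of an image or a canned phrase" in the first case refers to generated content in CSS2. A minimal sketch, using the class names warning and offsite mentioned earlier and a hypothetical image file:

```css
/* Insert a canned phrase in front of every warning */
DIV.warning:before {
  content: "Warning: ";
  font-weight: bold
}

/* Insert a small image after every link that leads off-site */
A.offsite:after {
  content: url("offsite.png")    /* hypothetical image file */
}
```

Anything that requires reordering or restructuring the document falls outside what these properties can do and belongs to the two XSLT cases.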

Cascading Style Sheets

CeBIT 2001, Hannover, Germany

PART 1 - intro to style & CSS

What is a Web style sheet?

The basics of CSS

The idea of a style sheet