W3C

Bert Bos | Selectors

Selectors

Cascading Style Sheets

Bert Bos (W3C) <bert@w3.org>

At: /* CSS Day */
Place: Amsterdam, Netherlands
Date: 14 June 2013

History

Knowing the goals & assumptions of CSS helps to understand the syntax and model

History: 1994 to mid 00's

Target audience:
everybody who can write HTML
Target styles:
books & articles with high quality typography but simple spatial arrangements
Target formats:
tree-structured with mostly text, mostly in reading order (such as HTML)

→ element types & nesting determine most of style

→ complex documents need another style sheet language (→ XSL)

To understand how the CSS selectors work and what the different kinds of punctuation mean, it helps to know the original and current goals of CSS. Although the selectors can be (and indeed are) used without CSS, they were designed to be easy and compact for applying certain kinds of style to certain kinds of documents. Extension mechanisms built into the syntax allow other uses, but they aren't necessarily as compact or easy to read.

The high quality is a function of the implementation. We don't expect the author to be a typographer; he just selects some fonts and margins. The UA's task is to do the best it can with the author's hints. The reader's task is then to select the UA that best satisfies his needs. UAs may offer excellent typography that is suitable for printing, high speed, automatic tables of contents, transposing tables, advanced search, user style sheets, hypertext features, intra- and inter-document navigation, etc. CSS is just a style sheet language, not a typesetting system.

That typography follows the tree structure, even in simple documents, is a simplification that has many exceptions. We decided to ignore most of them. ::first-line and ::first-letter are some that we did not ignore. On the other hand, we ignored the rule in some typographical traditions (American, but not French) that punctuation should be in the same style as the word that precedes it. (One way to deal with this is to include a transformation step in the formatting, e.g., by means of XSLT.)

The limitation to simple layout is necessary, because complex layouts are almost certainly difficult to make, especially on the Web, where you don't know the reader's window size.

We hoped HTML would have a long life, at least 50, if not 100 years, but we couldn't be sure of that. It might be that CSS would outlive HTML. And besides, there are other useful document formats (TEI, DocBook, etc.). And so CSS should not be bound too tightly to HTML, in so far as that was possible without making it too difficult to use.

For complex documents, I expected one of two things to happen: either we would learn enough from CSS in five or ten years to be able to make a language that was as easy to use, but also allowed complex layouts, or, more likely, we would make two new languages, an easy one for normal users and an advanced one suitable for complex documents.

What happened was that we made XSL (consisting of XSLT and XSL-FO). It had the right model, based on DSSSL, for complex layouts. But we never replaced CSS. And XSL, although very successful in the printing industry, never became popular for online or interactive documents. That had consequences for CSS, see below.

Document formats

CSS applicable roughly to formats that are

… not just SGML/XML/HTML5

(E.g., Qt uses CSS-derived style sheets)

(And not just for style: E.g., the Custom Properties module attaches arbitrary metadata to elements.)

Document formats

In other words:

This isn't a definition of the class of document formats that CSS can be applied to, just a set of heuristics to guide the design of the language.

History: now

Changing circumstances:

History: now

Consequences(?)

Target audience:
everybody who can write HTML → professional designers and app developers?
Target styles:
books & articles with high quality typography but simple spatial arrangements → high-speed, complex GUIs and high-quality typography for complex books and magazines?
Target formats:
tree-structured with mostly text, mostly in reading order (such as HTML) [unchanged]

For a while it looked as if XSL could cover the needs of complex documents, and that allowed us to refuse features that would have made CSS difficult to use.

But the lack of a standard language for describing GUIs on the one hand, and the current lack of resources to develop XSL-FO 2 on the other, has put CSS under strong pressure to add features for GUIs and for complex layout. The goal of keeping CSS usable for the normal Web author seems to have been largely abandoned. How to fix that situation is currently unknown.

Document model

The mark-up forms a tree

In SGML, XML and HTML5, the element type serves as the type of a node. The attributes are the attributes that CSS uses, with any values normalized (as per SGML/XML rules) and represented as Unicode strings.

In HTML/HTML5, MathML and SVG, the classes are given by the class attribute. In other formats, they may be specified in other ways, or not exist.

In HTML/HTML5, MathML and SVG, the unique name is given by the ID attribute. In other formats it may be specified differently, or be absent.

The namespace wasn't part of the original model. It was added in 1999, when XML Namespaces were added to XML. As far as CSS is concerned, a namespace is an arbitrary string. It may be specified with the url() notation, because in XML it is a URL, but CSS never dereferences it and doesn't care whether it is a valid URL or not.

The named links are proposed (for level 4 of CSS) to correspond to IDREF attributes in SGML and XML (and the equivalent in HTML5). The reason is that the LABEL element in HTML uses an IDREF to link itself to an INPUT element, and thus it would be nice to style the INPUT that belongs to a certain LABEL, or all LABELs that point to a certain INPUT. These slides do not treat this proposed feature.

Text nodes are leaf nodes. They have no further substructure and they cannot be selected with selectors. Of course, for the formatting part of CSS they do have some structure, even structure that may overlap the tree structure, in the form of lines, words and bidi-fragments. But that is outside the scope of these slides.

Document model

Note:

CSS doesn't parse, input is abstract

→ CSS doesn't see syntax errors

CSS is not like Perl or other tools to transform documents: it doesn't parse any text itself to create the parse tree. The input is abstract. It's not text, it's a tree, without any concrete representation.

CSS thus doesn't have to deal with concrete syntaxes or with parse errors. How a document is converted to a tree is out of scope.

What's not in the model

Selectors select elements (see below for “pseudo-elements”)

Not:

Note that XPath (and thus XSL) gives access to much more of the XML “Infoset.” CSS ignores most of SGML and XML and assumes its input is a simple tree. Such a tree can be made from an SGML or XML document in a fairly obvious way, but also from other kinds of formats.

In particular, the “fairly obvious way” includes expanding entities, ignoring processing instructions, ignoring comments, etc.

Attributes still play a role in selectors, although you cannot select them by themselves: they can be used to distinguish elements from one another.

CSS was never meant for all SGML documents, or to make use of all possible information in an SGML document. We only planned to use a subset of the information that SGML provided (see below).

I personally expected the Web to use only a subset of the capabilities of SGML, able to be parsed without the need for out-of-band information about the concrete syntax. I called my proposal for such a subset “SGML-Lite.” (It wasn't a complete specification; it proposed some goals and a few alternatives for the concrete syntax.)

When other people later took the initiative to define such a format and make it a standard, under the name of XML, SGML-Lite was one of the inputs, which meant that CSS could support XML right away. Only XML Namespaces, which were added to XML a little later, had not been foreseen and required an addition to the CSS model (see below).

Information from outside the tree

We have ideas for relying less on magic for the document semantics. E.g., we could use XLink, HLink, or CSS properties to indicate which elements are links (hyperlinks or transclusions, such as images). But then selectors such as ':link' become difficult to define…

Selecting pseudo-elements

More complex typography → more style that does not follow the document tree → more “pseudo-elements” and “at-rules”

and proposed:

Tree-based selectors (1)

Selector  Level  Description
*         2      any element (universal selector)
E         1      element of type E (note: namespace)
#D        1      element with unique ID D
.C        1      element with class C
XY        1      conjunction (no space between X and Y)
E F       1      an element F descendant of an element E
E > F     2      an element F child of an element E
E + F     2      an element F immediately after sibling E
E ~ F     3      an element F after sibling E

Apart from '~' these are all old and well-known, so I'll not say any more about them here. The '~' is the generalization of the '+': not just the immediately following sibling, but any following sibling.

For '+' and '~' (and for the pseudo-classes in the next slides), only elements are counted. Intervening text nodes do not matter. Thus the EM is the immediately following sibling of the SPAN in … <span>word</span> between <em>words</em>…
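As a minimal sketch of how text nodes are skipped when siblings are counted, the rules below apply the two sibling combinators to the fragment quoted above. The enclosing P and the declarations are assumptions made only for illustration.

```css
/* Assumed markup:
   <p>… <span>word</span> between <em>words</em> …</p> */

/* Matches the EM: the text " between " is not an element,
   so the EM is still the element immediately after the SPAN. */
span + em { font-weight: bold }

/* Also matches the EM, and would match any later EM sibling
   of the SPAN as well. */
span ~ em { color: #077 }
```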

In most modern programming languages, white space is not significant, other than that it is sometimes necessary to separate tokens: if true then needs spaces, because iftruethen would be a single token. But otherwise you can omit it: a := b + 7 is the same as a:=b+7.

Not so in CSS selectors: H2.sub is a conjunction (an element with type H2 and class sub), while H2 .sub is a descendant selector (an element with class sub that is a descendant of an element of type H2). Programmers often complain about this.

But it was a conscious decision: there are far fewer programmers than other people…

(There are other places in CSS where the syntax doesn't follow the recent tradition of programming languages: font-family accepts font names with and without quotes, and the white space is added in the obvious way. And grid templates also mix quoted strings and bare identifiers: flow: c refers to slot c in grid: "a b c".)
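The difference between H2.sub and H2 .sub can be made concrete with a short sketch. The class name comes from the text above; the EM rules and all declarations are illustrative assumptions.

```css
/* Conjunction: an H2 element that itself has class "sub" */
h2.sub { font-style: italic }

/* Descendant: any element with class "sub" somewhere inside an H2 */
h2 .sub { font-style: normal }

/* Around '>', '+' and '~' white space is optional,
   so these two selectors mean the same thing. */
h2>em   { color: gray }
h2 > em { color: gray }
```

Only the descendant combinator is written as white space, which is why a space before the dot changes the meaning while a space around '>' does not.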

Demo sibling elements

E ~ F
an element F that has an elder sibling E

<div>
  <h1 id=french>French...</h1>
  <p>...
  <p>...
  <p>...
</div>
<div>
  <h1 id=english>English...</h1>
  <p>...
  <p>...
  <p>...
</div>
#french ~ p {
  font-weight: bold;
  color: #077;
  font-size: 1.4em;
  padding-left: 1em;
}

Tree-based selectors (2)

Selector      Level  Description
[foo]         2      element with a “foo” attribute
[foo="bar"]   2      element whose “foo” attribute value is “bar”
[foo~="bar"]  2      element whose “foo” attribute value is a list of whitespace-separated values, one of which is “bar”
[foo^="bar"]  3      element whose “foo” attribute value begins with “bar”
[foo$="bar"]  3      element whose “foo” attribute value ends in “bar”
[foo*="bar"]  3      element whose “foo” attribute value contains “bar”
[foo|="en"]   2      element whose “foo” attribute is a hyphen-separated list beginning with “en”

Quotes may be " or ' (or omitted, if the value is an identifier)
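A short sketch of the quoting rule follows. The REL and HREF values and the declarations are assumptions chosen only to show when quotes can be dropped.

```css
/* Equivalent: "help" is a valid identifier, so quotes are optional */
a[rel=help]   { cursor: help }
a[rel="help"] { cursor: help }
a[rel='help'] { cursor: help }

/* Quotes are needed here: "http:" contains a colon
   and is therefore not an identifier. */
a[href^="http:"] { color: gray }

/* ~= matches one word in a white-space-separated list,
   e.g. rel="external help" */
a[rel~=help] { cursor: help }
```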

Demo attribute selectors

E[foo]
an element E that has an attribute called foo

img[alt] {
  border: 10px rgb(147,103,94) solid;
  border-radius: 10px;
}

Demo attribute contains text

E[foo*="bar"]
an element E with an attribute foo that contains the text “bar”

a[href*="glossaire"] {
  color: gray;
}

Demo attribute begins with text

E[foo^="bar"]
an element E with an attribute foo whose value starts with the text “bar”

a[href^="http:"] {
  color:#6996d3;
  padding-left:1em;
  text-decoration:none;
}

Demo attribute ends with text

E[foo$="bar"]
an element E that has an attribute foo whose value ends with the text “bar”

a[href$=".pdf"]::after {
  content: url(pictopdf.gif);
  vertical-align: -20px;
}

Tree-based selectors (3)

Selector        Level  Description
:root           3      root element of the document
:first-child    2      first child of its parent
:last-child     3      last child of its parent
:first-of-type  3      first child of its type
:last-of-type   3      last child of its type
:only-child     3      only child of its parent
:only-of-type   3      only child of its type
:empty          3      element with no children (incl. text nodes)

Demo first child

E:first-child
an element E that is the first child of its parent

th:first-child {
  border-left: none;
  text-align: left }
th:nth-child(1) {            /* same! */
  border-left: none;
  text-align: left }

nth-child is explained below.

Example for first-of-type

Assume a DL:

DD:first-of-type {margin-top: 0}
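To make the difference from :first-child visible, here is a sketch with a hypothetical DL. The markup in the comment and the second rule are assumptions, not taken from the original demo.

```css
/* Assumed markup:
   <dl>
     <dt>Term</dt>
     <dd>First definition</dd>
     <dd>Second definition</dd>
   </dl> */

/* The first DD is the second child of the DL, so dd:first-child
   would not match it; dd:first-of-type does. */
dd:first-of-type { margin-top: 0 }

/* The second DD among its siblings, regardless of how many
   DT elements precede it. */
dd:nth-of-type(2) { margin-top: 0.5em }
```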

Tree-based selectors (4)

Selector              Level  Description
:nth-child(n)         3      n-th child of its parent
:nth-last-child(n)    3      n-th child of its parent counting from the last
:nth-of-type(n)       3      n-th sibling of its type
:nth-last-of-type(n)  3      n-th sibling of its type counting from the last

nth-of-type is typically used together with a type selector: dd:nth-of-type(2)

The n'th child / every n'th child (1/2)

*:nth-child(2n+4)
every second child starting with the 4th (i.e., children 4, 6, 8, 10, etc.)

td:nth-child(2n + 4) {
  background:rgba(29, 53, 91, .3);
}

Note: no space between the “2” and the “n”!

The n'th child / every n'th child (2/2)

The n in an + b stands for all whole numbers ≥ 0 (i.e., 0, 1, 2,…)

If a or b is 0, you can omit that part:

Negative numbers are allowed (and sometimes useful):
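The following sketch works through a few values of an + b, with n running over 0, 1, 2, … Results that are zero or negative match nothing, because children are counted from 1. The LI element and the declarations are arbitrary assumptions.

```css
/* a = 0: "0n+5" shortens to "5", only the 5th child */
li:nth-child(5) { font-weight: bold }

/* b = 0: "3n+0" shortens to "3n", children 3, 6, 9, …
   (n = 0 gives 0, which matches nothing) */
li:nth-child(3n) { background: silver }

/* Negative a: -n+3 gives 3, 2, 1 for n = 0, 1, 2 and nothing
   after that, so this selects the first three children. */
li:nth-child(-n+3) { color: #077 }

/* Negative b: 2n-1 gives -1, 1, 3, 5, …, the same as "odd" */
li:nth-child(2n-1) { background: orange }

/* Counting from the end: the last two children */
li:nth-last-child(-n+2) { font-style: italic }
```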

Even and odd

E:nth-child(odd)
same as nth-child(2n+1)

E:nth-child(even)
same as nth-child(2n+2)

td:nth-child(odd){
  background:orange;
}

Tree-based selectors (5)

Selector  Level  Description
X|E       3      element E that has a namespace X
|E        3      element E that has no namespace
*|E       3      element E with or without a namespace
E         3      if the style sheet has a default namespace: element E in that namespace
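As a sketch of how the four forms behave together, the style sheet below declares a default namespace and one prefix. The prefix name "svg", the choice of the A element and the declarations are assumptions for illustration; the namespace strings are the usual XHTML and SVG ones.

```css
/* @namespace rules must come before any style rules */
@namespace url(http://www.w3.org/1999/xhtml);    /* default namespace */
@namespace svg url(http://www.w3.org/2000/svg);  /* prefix "svg" */

/* A elements in the SVG namespace only */
svg|a { stroke: blue }

/* Because a default namespace is declared, this matches
   A elements in the XHTML namespace only. */
a { color: blue }

/* A elements in any namespace, or in none */
*|a { text-decoration: underline }

/* A elements that have no namespace at all */
|a { color: gray }
```

The prefix is local to the style sheet; only the namespace string is compared with the document, and CSS never dereferences it.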

Some missing tree-based selectors

Some attributes may be defined as case-insensitive by the document format, but for those that are not, it may still be useful to match them case-insensitively, e.g., if they contain human-readable text.

These selectors are missing in part because they are difficult for the average user. Until we have a new, easy-to-use style sheet language, we may want to hold off on adding them.

External information (1)

Semantic information about the document format & information about the user

Selector   Level  Description
:link      1      a hypertext source anchor not recently traversed
:visited   1      a hypertext source anchor recently traversed
:enabled   3      a form control that is enabled
:disabled  3      a form control that is disabled

External information (2)

User interaction

Selector  Level  Description
:active   1      form element or hyperlink during user interaction
:hover    2      element currently under the (mouse) pointer
:focus    2      form element that currently has keyboard focus

External information (3)

Information about the document

Selector   Level  Description
:target    3      element that is the target of the referring URI
:lang(fr)  2      element in language “fr”

Language can come, e.g., from protocol headers and be overridden by attributes (such as xml:lang)

The target of the current URL

:target
the element that is pointed at by the current URL, e.g.:

http://example.org/doc#fragment

From the demo:

:target { border:4px gold solid }

If you follow a link to somewhere in the middle of another document, then you can give the element that you jumped to a special style by means of ':target'.

Of course, you can also jump within the same document, e.g., from the table of contents to a section. Some tricks rely on jumping within a document to change the style of the document with every click, such as showing and hiding tabbed cards. (But hopefully one day we'll have ways to do such style changes directly, without the limitations of this trick.)

External information (4)

Document semantics + user interaction

Selector  Level  Description
:checked  3      a user interface element that is checked (radio-button, checkbox)

Typographical elements

Some regions that do not correspond to elements

Selector        Level  Description
::first-line    1      the first formatted line of an element
::first-letter  1      the first formatted letter of an element
::value         3      the box with the value of a text field in a form
::choices       3      the box with the list of options in a menu in a form
::repeat-item   3      an element in a form that the user can make multiple copies of
::repeat-index  3      ditto, but which is the "current" element in the list

Some missing selectors for typographical regions

External information (5)

External information about the document format

Selector  Level  Description
.warning  1      an element that belongs to class “warning”

(the document format specifies how class is determined)

Logic

Selector  Level  Description
:not(s)   3      an element that does not match simple selector s

CSS selectors don't have fully general boolean logic. There is a top-level OR (the comma) only. And the NOT only applies to the simple selectors. There is a proposal for level 4 to allow a kind of parentheses to have an OR inside a selector, and even inside a NOT.
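The available logic at level 3 can be sketched as follows. The element and class names and the declarations are assumptions chosen for illustration.

```css
/* OR: only at the top level, written with a comma */
h1, h2, h3 { font-family: sans-serif }

/* AND: a conjunction of simple selectors without spaces */
a.external[href] { color: green }

/* AND NOT: :not() takes one simple selector,
   so several conditions need several :not()s. */
input:not([type=checkbox]):not([type=radio]) { border: thin solid }

/* No OR inside a selector: "an H1 child of a SECTION or of an
   ARTICLE" has to be spelled out with the top-level comma. */
section > h1, article > h1 { margin-top: 0 }
```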

Demo negation selectors 1

E:not(s)
an element E that doesn't match s

img:not([alt]) {
  border: 10px rgb(147,103,94) solid;
  border-radius: 10px;
}

Demo negation selectors 2

A German flag in front of every LI element that is in German when its parent is not:

:not(:lang(de)) > li[lang|=de]:before {
 content: url(de) " " }

Negative selectors are difficult to use, especially in a contextual selector (a selector that includes ancestors, siblings or descendants). Often the easier way is to style all except for a specific one. But sometimes that is not possible, and the :not() is the only way.

The example is one such case. It selects elements in German whose parent is not in German. Listing all possible languages that are not German is impossible if you do not know in advance which languages are being used. That is the case here, because this comes from a style sheet for a set of pages with a growing number of translations. Adding a style rule whenever a new translation is added would be tedious.

Note also how this uses both the :lang() selector (to match the parent, whose language need not come from an attribute, but could be inherited), and the |= operator to check the LANG attribute. In this case, the [lang|=de] could actually have been :lang(de) as well.
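The difference between :lang() and the |= operator can be sketched with a small hypothetical fragment. The markup in the comment and the declarations are assumptions.

```css
/* Assumed markup:
   <div lang="de">
     <p>Ein Absatz <em>ohne</em> eigenes Attribut.</p>
   </div> */

/* Matches the DIV, the P and the EM: the language of the P and
   the EM is inherited from the DIV. */
:lang(de) { quotes: "„" "“" }

/* Matches only the DIV: it is the only element that actually
   carries a LANG attribute. */
[lang|=de] { border-left: thick solid }
```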

Some missing logic

Currently only top-level OR and limited AND

Level 4 proposes a local OR

An exercise…

How would you make a document that allows you to switch which of two texts is visible?

Hint: you can do this with :target and ~, or with :checked and ~ (What is the advantage of :checked over :target in this case?)

Answer

:checked obviously needs a checkbox in the document, :target only needs a link (which can be made to look like a button, e.g.)

But :target has the disadvantage that every activation is added to the history. Going back then doesn't go back to the previous document, but reopens or closes the text.

Both have to be preceding siblings of the text to hide (or of an ancestor of the text to hide), because in level 3 there are no selectors that can go “back up” the tree.
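One possible solution, as a sketch only: the markup in the comments, the IDs and the class names are all assumptions. In both variants the element that carries the state precedes the two texts as a sibling, so that '~' can reach them.

```css
/* Variant 1, with :checked. Assumed markup:
   <input type="checkbox" id="toggle">
   <label for="toggle">Switch text</label>
   <p class="first">First text</p>
   <p class="second">Second text</p> */

.second { display: none }
#toggle:checked ~ .first  { display: none }
#toggle:checked ~ .second { display: block }

/* Variant 2, with :target. Assumed markup:
   <span id="alt"></span>
   <a href="#alt">Show second text</a>
   <p class="first">First text</p>
   <p class="second">Second text</p> */

#alt:target ~ .first  { display: none }
#alt:target ~ .second { display: block }
```

In the second variant, following the link makes the SPAN the target, and going back in the history (or following a link to any other fragment) restores the first text.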

The end

http://www.w3.org/Talks/2013/0614-CSS-Amsterdam
