Bert Bos | Selectors

Selectors

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Cascading Style Sheets

Bert Bos (W3C) <bert@w3.org>

At: /* CSS Day */
Place: Amsterdam, Netherlands
Date: 14 June 2013

History

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Knowing the goals & assumptions of CSS helps to understand the syntax and model

History: 1994 to mid 00's

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Target audience:: everybody who can write HTML
Target styles:: books & articles with high quality typography but simple spatial arrangements
Target formats:: tree-structured with mostly text, mostly in reading order (such as HTML)

→ element types & nesting determine most of style

→ complex documents need another style sheet language (→ XSL)

To understand how the CSS selectors work and what the different kinds of punctuation mean, it helps to know the original and current goals of CSS. Although the selectors can be (and indeed are) used without CSS, they were designed to be easy and compact for applying certain kinds of style to certain kinds of documents. Extension mechanisms built in to the syntax allow other uses, but they aren't necessarily as compact or easy to read.

The high quality is a function of the implementation. We don't expect the author to be a typographer; he just selects some fonts and margins. The UA's task is to do the best it can with the author's hints. The reader's task is then to select the UA that best satisfies his needs. UAs may offer excellent typography that is suitable for printing, high speed, automatic tables of contents, transposing tables, advanced search, user style sheets, hypertext features, intra- and inter-document navigation, etc. CSS is just a style sheet language, not a typesetting system.

That typography follows the tree structure, even in simple documents, is a simplification that has many exceptions. We decided to ignore most of them. ::first-line and ::first-letter are some that we did not ignore. On the other hand, we ignored the rule in some typographical traditions (American, but not French) that punctuation should be in the same style as the word that precedes it. (One way to deal with this is to include a transformation step in the formatting, e.g., by means of XSLT.)

The limitation to simple layout is necessary, because complex layouts are almost certainly difficult to make, especially on the Web, where you don't know the reader's window size.

We hoped HTML would have a long life, at least 50, if not 100 years, but we couldn't be sure of that. It might be that CSS would outlive HTML. And besides, there are other useful document formats (TEI, DocBook, etc.). And so CSS should not be bound too tightly to HTML, in as far as that was possible without making it too difficult to use.

For complex documents, I expected one of two things to happen: either we would learn enough from CSS in five or ten years to be able to make a language that was as easy to use, but also allowed complex layouts, or, more likely, we would make two new languages, an easy one for normal users and an advanced one suitable for complex documents.

What happened was that we made XSL (consisting of XSLT and XSL-FO). It had the right model, based on DSSSL, for complex layouts. But we never replaced CSS. And XSL, although very successful in the printing industry, never became popular for online or interactive documents. That had consequences for CSS, see below.

Document formats

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

CSS applicable roughly to formats that are

tree-structured
mostly text
mostly in reading order

… not just SGML/XML/HTML5

(E.g., Qt uses CSS-derived style sheets)

(And not just for style: E.g., the Custom Properties module attaches arbitrary metadata to elements.)

Document formats

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

In other words:

the raw text already makes some sense
a few line breaks already make it usable
some indents already give it structure
some font styles and it already looks like an article
with some images and minimal spatial arrangement it is good enough for printing

This isn't a definition of the class of document formats that CSS can be applied to, just a set of heuristics to guide the design of the language.

History: now

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Changing circumstances:

XSL successful for print, not for online or interactive
XSL-FO 2 uncertain → publishers hope for CSS extensions instead
no standard language for GUIs → have to use HTML+CSS

History: now

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Consequences(?)

Target audience:: ~~everybody who can write HTML~~ professional designers and app developers?
Target styles:: ~~books & articles with high quality typography but simple spatial arrangements~~ high-speed, complex GUIs and high-quality typography for complex books and magazines?
Target formats:: tree-structured with mostly text, mostly in reading order (such as HTML) [unchanged]

For a while it looked as if XSL could cover the needs of complex documents, and that allowed us to refuse features that would have made CSS difficult to use.

But the lack of a standard language for describing GUIs on the one hand and the current lack of resources to develop XSL-FO 2 on the other have put CSS under heavy pressure to add features for GUIs and for complex layout. The goal to keep CSS usable for the normal Web author seems to be largely abandoned. How to fix that situation is currently unknown.

Document model

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

The mark-up forms a tree

tree of ordered nodes:
- element nodes, which have:
  - a type
  - a set of zero or more, unordered attributes
    - name
    - string value
  - a set of zero or more classes
  - optional unique name
  - optional namespace (added in 1999)
  - zero or more named links (proposed)
- text nodes (Unicode text, as leaf nodes only)
outside information (see below)

In SGML, XML and HTML5, the element type serves as the type of a node. The attributes are the attributes that CSS uses, with any values normalized (as per SGML/XML rules) and represented as Unicode strings.

In HTML/HTML5, MathML and SVG, the classes are given by the class attribute. In other formats, they may be specified in other ways, or not exist.

In HTML/HTML5, MathML and SVG, the unique name is given by the ID attribute. In other formats it may be specified differently, or be absent.

The namespace wasn't part of the original model. It was added in 1999, when XML Namespaces were added to XML. As far as CSS is concerned, a namespace is an arbitrary string. It may be specified with the url() notation, because in XML it is a URL, but CSS never dereferences it and doesn't care whether it is a valid URL or not.

The named links are proposed (for level 4 of CSS) to correspond to IDREF attributes in SGML and XML (and the equivalent in HTML5). The reason is that the LABEL element in HTML uses an IDREF to link itself to an INPUT element and thus it would be nice to style the INPUT that belongs to a certain LABEL, or all LABELs that point to a certain INPUT. These slides do not treat this proposed feature.

Text nodes are leaf nodes. They have no further substructure and they cannot be selected with selectors. Of course, for the formatting part of CSS they do have some structure, even structure that may overlap the tree structure, in the form of lines, words and bidi-fragments. But that is outside the scope of these slides.

Document model

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Note:

CSS doesn't parse, input is abstract

→ CSS doesn't see syntax errors

CSS is not like Perl or other tools to transform documents: it doesn't parse any text itself to create the parse tree. The input is abstract. It's not text, it's a tree, without any concrete representation.

CSS thus doesn't have to deal with concrete syntaxes or with parse errors. How a document is converted to a tree is out of scope.

What's not in the model

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selectors select elements (see below for “pseudo-elements”)

Not:

letters
comments
entities
processing instructions
attributes
DTD subsets
syntax details (spaces, delimiters such as <)
syntax errors

Note that XPath (and thus XSL) gives access to much more of the XML “Infoset.” CSS ignores most of SGML and XML and assumes its input is a simple tree. Such a tree can be made from an SGML or XML document in a fairly obvious way, but also from other kinds of formats.

In particular, the “fairly obvious way” includes expanding entities, ignoring processing instructions, ignoring comments, etc.

Attributes still play a role in selectors, although you cannot select them by themselves: they can be used to distinguish elements from one another.

CSS was never meant for all SGML documents, or to make use of all possible information in an SGML document. We only planned to use a subset of the information that SGML provided (see below).

I personally expected the Web to use only a subset of the capabilities of SGML, able to be parsed without the need for out-of-band information about the concrete syntax. I called my proposal for such a subset “SGML-lite.” (It wasn't a complete specification, it proposed some goals and a few alternatives for the concrete syntax.)

When other people later took the initiative to define such a format and make it a standard, under the name of XML, SGML-lite was one of the inputs, which meant that CSS could support XML right away. Only XML Namespaces, which were added to XML a little later, had not been foreseen and required an addition to the CSS model (see below).

Information from outside the tree

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

The rendering surface (size, colors…)
Document semantics (hyperlinks, replaced objects…)
History (visited)
User interaction (focus, checked, active, hover…)
Etc.

We have ideas for relying less on magic for the document semantics. E.g., we could use XLink, HLink, or CSS properties to indicate which elements are links (hyperlinks or transclusions, such as images). But then selectors such as ':link' become difficult to define…

Selecting pseudo-elements

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

More complex typography → more style that does not follow the document tree → more “pseudo-elements” and “at-rules”

::first-line, ::first-letter
pages, running headers (@page, @top)
form control parts (::value, ::choices)

and proposed:

list markers, footnote markers (::marker)
templates/regions (::slot(), ::column(), @region)

Tree-based selectors (1)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`*`	any element (universal selector)	2
`E`	element of type E (note: namespace)	1
`#D`	element with unique ID D	1
`.C`	element with class C	1
`XY`	(no space between X and Y) conjunction	1
`E F`	an element F descendant of an element E	1
`E > F`	an element F child of an element E	2
`E + F`	an element F immediately after sibling E	2
`E ~ F`	an element F after sibling E	3

Apart from '~' these are all old and well-known, so I'll not say any more about them here. The '~' is the generalization of the '+': not just the immediately following sibling, but any following sibling.

For '+' and '~' (and for the pseudo-classes in the next slides), only elements are counted. Intervening text nodes do not matter. Thus the EM is the immediately following sibling of the SPAN in … <span>word</span> between <em>words</em>…
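
As a minimal sketch of the two sibling combinators, assuming the SPAN/EM mark-up from the note above with a second EM added for illustration (the colors are arbitrary):

```css
/* Assumed mark-up:
   <p><span>word</span> between <em>words</em> and <em>more</em></p> */

/* '+' matches only the first EM: it is the next element after the SPAN.
   The text " between " is a text node and is not counted. */
span + em { color: red }

/* '~' matches both EMs: each one follows the SPAN among its siblings. */
span ~ em { font-weight: bold }
```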

In most modern programming languages, white space is not significant, other than that it is sometimes necessary to separate tokens: if true then needs spaces, because iftruethen would be a single token. But otherwise you can omit it: a := b + 7 is the same as a:=b+7.

Not so in CSS selectors: H2.sub is a conjunction (an element with type H2 and class sub), while H2 .sub is a descendant selector (an element with class sub that is a descendant of an element of type H2). Programmers often complain about this.

But it was a conscious decision: there are far fewer programmers than other people…

(There are other places in CSS where the syntax doesn't follow the recent tradition of programming languages: font-family accepts font names with and without quotes, and the white space is added in the obvious way. And grid templates also mix quoted strings and bare identifiers: flow: c refers to slot c in grid: "a b c".)
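
The significance of white space in selectors can be sketched as follows. The class name sub comes from the note above; the declarations are arbitrary and only there to make the rules complete.

```css
/* Conjunction: an H2 element that itself has class "sub" */
h2.sub { font-style: italic }

/* Descendant: any element with class "sub" somewhere inside an H2 */
h2 .sub { font-size: smaller }

/* Around '>', '+' and '~' white space is optional:
   these two selectors mean the same thing */
h2>.sub { color: gray }
h2 > .sub { color: gray }
```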

Demo sibling elements

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E ~ F
an element F that has elder sibling E

<div>
  <h1 id=french>French...</h1>
  <p>...
  <p>...
  <p>...
</div>
<div>
  <h1 id=english>English...</h1>
  <p>...
  <p>...
  <p>...
</div>

#french ~ p {
  font-weight: bold;
  color: #077;
  font-size: 1.4em;
  padding-left: 1em;
}

Tree-based selectors (2)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`[foo]`	element with a “foo” attribute	2
`[foo="bar"]`	element whose “foo” attribute value is “bar”	2
`[foo~="bar"]`	element whose “foo” attribute value is a list of whitespace-separated values one of which is “bar”	2
`[foo^="bar"]`	element whose “foo” attribute value begins with “bar”	3
`[foo$="bar"]`	element whose “foo” attribute value ends in “bar”	3
`[foo*="bar"]`	element whose “foo” attribute value contains “bar”	3
`[foo|="en"]`	element whose “foo” attribute is a hyphen-separated list beginning with “en”	2

Quotes may be " or ' (or omitted, if the value is an identifier)
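
A short sketch of the quoting rule and of the two list-matching operators. The attribute values and declarations are illustrative assumptions, not taken from the demos.

```css
/* These three selectors are equivalent, because en is an identifier */
[lang=en]   { color: navy }
[lang="en"] { color: navy }
[lang='en'] { color: navy }

/* Quotes are required here: "#top" and "2" are not identifiers */
a[href="#top"]  { text-decoration: none }
td[colspan="2"] { text-align: center }

/* '~=' matches one word in a space-separated list:
   this matches rel="nofollow external", but not rel="externals" */
a[rel~="external"] { color: green }

/* '|=' matches the whole value or a prefix followed by a hyphen:
   this matches hreflang="en" and hreflang="en-GB", but not "english" */
a[hreflang|="en"] { font-style: italic }
```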

Demo attribute selectors

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E[foo]
an element E that has an attribute called foo

img[alt] {
  border: 10px rgb(147,103,94) solid;
  border-radius: 10px;
}

Demo attribute contains text

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E[foo*="bar"]
an element E with an attribute foo that contains the text “bar”

a[href*="glossaire"] {
  color: gray;
}

Demo attribute begins with text

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E[foo^="bar"]
an element E with an attribute foo whose value starts with the text “bar”

a[href^="http:"] {
  color:#6996d3;
  padding-left:1em;
  text-decoration:none;
}

Demo attribute ends with text

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E[foo$="bar"]
an element E that has an attribute foo whose value ends with the text “bar”

a[href$=".pdf"]::after {
  content:url(pictopdf.gif);
  vertical-align: -20px;
}

Tree-based selectors (3)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`:root`	root element of the document	3
`:first-child`	first child of its parent	2
`:last-child`	last child of its parent	3
`:first-of-type`	first child of its type	3
`:last-of-type`	last child of its type	3
`:only-child`	only child of its parent	3
`:only-of-type`	only child of its type	3
`:empty`	element with no children (incl. text nodes)	3
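
A small sketch of how some of these pseudo-classes behave, assuming the hypothetical mark-up given in the comments:

```css
/* Assumed mark-up: <p></p>  <p> </p>  <p><!-- note --></p>
   The first and third P match: comments are not part of the tree.
   The second P does not match: the space is a text node. */
p:empty { display: none }

/* An LI that is the only child of its list */
li:only-child { list-style: none }

/* The last element inside its parent, but only if it is a P */
p:last-child { margin-bottom: 0 }
```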

Demo first child

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E:first-child
an element E that is the first child of its parent

th:first-child {
  border-left: none;
  text-align: left }
th:nth-child(1) {            /* same! */
  border-left: none;
  text-align: left }

nth-child is explained below.

Example for first-of-type

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Assume a DL:

DTs are run-in
DT {display: run-in}
some margin between the DDs
DD {margin: 0.5em 0}
no margin above and below
DL {margin: 0}
how to suppress the margin before the first DD?
(there may be zero or more DTs before it)

DD:first-of-type {margin-top: 0}

Tree-based selectors (4)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`:nth-child(n)`	n-th child of its parent	3
`:nth-last-child(n)`	n-th child of its parent counting from the last	3
`:nth-of-type(n)`	n-th sibling of its type	3
`:nth-last-of-type(n)`	n-th sibling of its type counting from the last	3

nth-of-type is typically used together with a type selector: dd:nth-of-type(2)
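
The difference between counting by type and counting all children can be sketched like this, assuming a small hypothetical DL:

```css
/* Assumed mark-up:
   <dl><dt>A</dt><dt>B</dt><dd>one</dd><dd>two</dd></dl> */

/* Matches "two": the second DD among its siblings */
dd:nth-of-type(2) { color: red }

/* Matches nothing here: the second child of the DL is a DT */
dd:nth-child(2) { color: blue }

/* Matches "two" as well; same as dd:last-of-type */
dd:nth-last-of-type(1) { margin-bottom: 0 }
```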

The n'th child / every n'th child (1/2)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

*:nth-child(2n+4)
every second child starting with the 4th (i.e., children 4, 6, 8, 10, etc.)

td:nth-child(2n + 4) {
  background:rgba(29, 53, 91, .3);
}

Note: no space between the “2” and the “n”!

The n'th child / every n'th child (2/2)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

The n in an + b stands for all whole numbers ≥ 0 (i.e., 0, 1, 2,…)

If a or b is 0, you can omit that part:

:nth-child(0n+7) → :nth-child(7)
:nth-child(3n+0) → :nth-child(3n)

Negative numbers are allowed (and sometimes useful):

:nth-child(2n-5) same as :nth-child(2n+1)
:nth-child(-n+3) only children 1, 2 and 3 (!)
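
A few worked cases of the an + b notation, with the matching positions listed in the comments. The element types and declarations are arbitrary.

```css
tr:nth-child(3n)   { background: silver }  /* rows 3, 6, 9, ... */
tr:nth-child(3n+1) { font-weight: bold }   /* rows 1, 4, 7, ... */

/* n = 0, 1, 2 gives 3, 2, 1; larger n gives 0 or less, which matches nothing */
li:nth-child(-n+3) { color: red }          /* items 1, 2 and 3 */

/* Two conditions on the same element select a range:
   n+4 gives 4, 5, 6, ... and -n+6 gives 6, 5, 4, ..., 1 */
li:nth-child(n+4):nth-child(-n+6) { color: blue }  /* items 4, 5 and 6 */

/* Counting from the end works the same way */
li:nth-last-child(-n+2) { color: green }   /* the last two items */
```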

Even and odd

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E:nth-child(odd)
same as nth-child(2n+1)

E:nth-child(even)
same as nth-child(2n+2)

td:nth-child(odd){
  background:orange;
}

Tree-based selectors (5)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`X|E`	element E that has a namespace X	3
`|E`	element E that has no namespace	3
`*|E`	element E with or without a namespace	3
`E`	if the style sheet has a default namespace: element E in that namespace	3
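
A sketch of how the prefixes are declared and used. The prefix name svg is an arbitrary choice for this example; the two URLs are the XHTML and SVG namespace names. The @namespace rules have to come before any style rules.

```css
/* Default namespace: applies to selectors without a prefix */
@namespace url(http://www.w3.org/1999/xhtml);
/* A prefix of our own choosing for the SVG namespace */
@namespace svg url(http://www.w3.org/2000/svg);

a     { color: blue }            /* A elements in the XHTML namespace only */
svg|a { fill: blue }             /* A elements in the SVG namespace */
|a    { color: gray }            /* A elements that have no namespace */
*|a   { cursor: pointer }        /* A elements in any namespace or in none */
```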

Some missing tree-based selectors

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

parent (proposed in level 4)
ancestor (proposed in level 4)
immediately preceding sibling (proposed in level 4)
preceding sibling (proposed in level 4)
content
case-insensitive attribute matching
regular-expression-based attribute matching
linked element (IDREF, proposed in level 4)

Some attributes may be defined as case-insensitive by the document format, but for those that are not, it may still be useful to match them case-insensitively, e.g., if they contain human-readable text.

These missing selectors are missing in part because they are difficult for the average user. Until we have a new, easy-to-use style sheet language, we may want to hold off from adding these.

External information (1)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Semantic information about the document format & information about the user

Selector	Description	Level
`:link`	a hypertext source anchor not recently traversed	1
`:visited`	a hypertext source anchor recently traversed	1
`:enabled`	a form control that is enabled	3
`:disabled`	a form control that is disabled	3

External information (2)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

User interaction

Selector	Description	Level
`:active`	form element or hyperlink during user interaction	1
`:hover`	element currently under the (mouse) pointer	2
`:focus`	form element that currently has keyboard focus	2

External information (3)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Information about the document

Selector	Description	Level
`:target`	element that is the target of the referring URI	3
`:lang(fr)`	element in language “fr”	2

Language can come, e.g., from protocol headers and be overridden by attributes (such as xml:lang)
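
A sketch of the difference between :lang() and an attribute selector, assuming the hypothetical mark-up in the comment:

```css
/* Assumed mark-up:
   <div lang="fr"><p>Il a dit <q>bonjour</q></p></div> */

/* Matches the Q: its language is French, although it has no attribute itself */
q:lang(fr) { quotes: "« " " »" }

/* Matches nothing here: only the DIV carries a lang attribute */
q[lang|=fr] { font-style: italic }

/* :lang(fr) also covers sublanguages, such as lang="fr-CA" */
p:lang(fr) { color: navy }
```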

The target of the current URL

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

:target
the element that is pointed at by the current URL, e.g.:

http://example.org/doc#fragment

From the demo:

:target { border:4px gold solid }

If you follow a link to somewhere in the middle of another document, then you can give the element that you jumped to a special style by means of ':target'.

Of course, you can also jump within the same document, e.g., from the table of contents to a section. Some tricks rely on jumping within a document to change the style of the document with every click, such as showing and hiding tabbed cards. (But hopefully one day we'll have ways to do such style changes directly, without the limitations of this trick.)

External information (4)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Document semantics + user interaction

Selector	Description	Level
`:checked`	a user interface element that is checked (radio-button, checkbox)	3

Typographical elements

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Some regions that do not correspond to elements

Selector	Description	Level
`::first-line`	the first formatted line of an element	1
`::first-letter`	the first formatted letter of an element	1
`::value`	the box with the value of a text field in a form	3
`::choices`	the box with the list of options in a menu in a form	3
`::repeat-item`	an element in a form that the user can make multiple copies of	3
`::repeat-index`	ditto, but which is the "current" element in the list	3

Some missing selectors for typographical regions

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

regions in a template (::slot(x), ::column(n))
special regions (::footnote-call, ::footnote-marker, ::marker)
table selectors (::nth-column(), ::nth-last-column())
blank elements (either :empty or nothing left after formatting, in particular due to 'white-space')
time-based (for dynamic rendering, such as projection and speech: :current, :past, :future)

External information (5)

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

External information about the document format

Selector	Description	Level
`.warning`	an element that belongs to class “warning”	1

(the document format specifies how class is determined)

Logic

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Selector	Description	Level
`:not(s)`	an element that does not match simple selector `s`	3

CSS selectors don't have fully general boolean logic. There is a top-level OR (the comma) only. And the NOT only applies to the simple selectors. There is a proposal for level 4 to allow a kind of parentheses to have an OR inside a selector, and even inside a NOT.
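
The logic that is available in level 3 can be sketched as follows. The element and class names are illustrative assumptions.

```css
/* OR: the comma, at the top level only */
h1, h2, h3 { font-family: sans-serif }

/* AND: a conjunction of simple selectors, written without spaces */
a.external[href] { color: green }

/* NOT: takes one simple selector; chaining two gives "neither ... nor" */
input:not([type=checkbox]):not([type=radio]) { border: thin solid }

/* Not valid in level 3, because the argument is not a simple selector:
   :not(ul > li)   :not(h1, h2) */
```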

Demo negation selectors 1

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

E:not(s)
an element E that doesn't match s

img:not([alt]) {
  border: 10px rgb(147,103,94) solid;
  border-radius: 10px;
}

Demo negation selectors 2

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

A German flag in front of every LI element that is in German when its parent is not:

:not(:lang(de)) > li[lang|=de]:before {
  content: url(de) " " }

Negative selectors are difficult to use, especially in a contextual selector (a selector that includes ancestors, siblings or descendants). Often the easier way is to style all except for a specific one. But sometimes that is not possible, and the :not() is the only way.

The example is one such. It selects elements in German of which the parent is not in German. Listing all possible languages that are not German is impossible if you do not know in advance which languages are being used. That is the case here, because this is from a style sheet for a set of pages with a growing number of translations. Adding a style rule whenever a new translation is added would be tedious.

Note also how this uses both the :lang() selector (to match the parent, whose language need not come from an attribute, but could be inherited), and the |= operator to check the LANG attribute. In this case, the [lang|=de] could actually have been :lang(de) as well.

Some missing logic

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

Currently only top-level OR and limited AND

Level 4 proposes a local OR

An exercise…

●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●

How would you make a document that allows you to switch which of two texts is visible?

Hint: you can do this with :target and ~, or with :checked and ~ (What is the advantage of :checked over :target in this case?)

Answer

:checked obviously needs a checkbox in the document, :target only needs a link (which can be made to look like a button, e.g.)

But :target has the disadvantage that every activation is added to the history. Going back then doesn't go back to the previous document, but reopens or closes the text.

Both have to be preceding siblings of the text to hide (or of an ancestor of the text to hide), because in level 3 there are no selectors that can go “back up” the tree.
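
One possible solution for each of the two approaches, as a sketch only. All IDs and the mark-up in the comments are assumptions made for this example.

```css
/* Variant 1, with :checked. Assumed mark-up:
   <input type=checkbox id=toggle> <label for=toggle>Switch</label>
   <div id=a>First text</div>
   <div id=b>Second text</div> */
#b { display: none }
#toggle:checked ~ #a { display: none }
#toggle:checked ~ #b { display: block }

/* Variant 2, with :target. Assumed mark-up:
   <a href="#alt">Show second</a> <a href="#">Show first</a>
   <span id=alt></span>
   <div id=first>First text</div>
   <div id=second>Second text</div> */
#second { display: none }
#alt:target ~ #first { display: none }
#alt:target ~ #second { display: block }
```

In both variants the switching element (the checkbox or the empty SPAN) comes before the two texts, so that '~' can reach them.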