4 CSS2 syntax and basic data types

Contents

  1. Syntax
    1. Tokenization
    2. Characters and case
    3. Statements
    4. At-rules
    5. Blocks
    6. Rule sets, declaration blocks, and selectors
    7. Declarations and properties
    8. Comments
  2. Rules for handling parsing errors
  3. Values
    1. Integers and real numbers
    2. Lengths
    3. Percentages
    4. URIs
    5. Colors
    6. Angles
    7. Times
    8. Frequencies
    9. Strings
  4. CSS embedded in HTML
  5. CSS as a stand-alone file
  6. Character escapes in CSS

4.1 Syntax

This section describes a grammar common to any version of CSS (including CSS2). Future versions of CSS will adhere to this core syntax, although they may add additional syntactic constraints.

These descriptions are normative. They are also complemented by the normative grammar rules presented in Appendix B.

4.1.1 Tokenization

All levels of CSS (level 1, level 2, and any future levels) use the same core syntax. This allows UAs to parse (though not, of course, completely understand) style sheets written in levels of CSS that didn't exist at the time the UAs were created. Designers can use this feature to create style sheets that work with downlevel UAs, while also exercising the possibilities of the latest levels of CSS.

CSS style sheets consist of a sequence of tokens. The list of tokens for CSS2 is as follows. The definitions use Lex-style regular expressions. Octal codes refer to [ISO10646]. As in Lex, in case of multiple matches, the longest match determines the token.
Token Definition

IDENT {ident}
ATKEYWORD @{ident}
STRING {string}
HASH #{name}
NUMBER {num}
PERCENTAGE {num}%
DIMENSION {num}{ident}
URI url\({w}{string}{w}\)
|url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}\)
UNICODE-RANGE U\+[0-9A-F?]{1,6}(-[0-9A-F]{1,6})?
CDO \<!--
CDC -->
; ;
{ \{
} \}
( \(
) \)
[ \[
] \]
S [ \t\r\n\f]+
COMMENT \/\*[^*]*\*+([^/][^*]*\*+)*\/
FUNCTION {ident}\(
INCLUDES ~=
DELIM any other character

The macros in curly braces ({}) above are defined as follows:
Macro Definition

ident {nmstart}{nmchar}*
nmstart [a-zA-Z]|{nonascii}|{escape}
nonascii [^\0-\177]
unicode \\[0-9a-f]{1,6}
escape {unicode}|\\[ -~\200-\4177777]
nmchar [a-z0-9-]|{nonascii}|{escape}
num [0-9]+|[0-9]*\.[0-9]+
string {string1}|{string2}
string1 \"([\t !#$%&(-~]|\\\n|\'|{nonascii}|{escape})*\"
string2 \'([\t !#$%&(-~]|\\\n|\"|{nonascii}|{escape})*\'
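
The longest-match rule can be illustrated with a small Python sketch. It is not a full tokenizer: it covers only four tokens, the patterns are ASCII-only ({nonascii} and {escape} are left out), uppercase letters are accepted throughout identifiers, and the name `longest_match` is hypothetical.

```python
import re

# Simplified, ASCII-only versions of the {num} and {ident} macros.
# Python alternation takes the first alternative that matches (not the
# longest), so the decimal form is listed before the plain-digit form.
NUM = r"(?:[0-9]*\.[0-9]+|[0-9]+)"
IDENT = r"[a-zA-Z][a-zA-Z0-9-]*"

TOKENS = [
    ("NUMBER", NUM),
    ("PERCENTAGE", NUM + "%"),
    ("DIMENSION", NUM + IDENT),
    ("IDENT", IDENT),
]


def longest_match(text):
    """Return (token_name, lexeme) for the longest token at the start of text."""
    best = None
    for name, pattern in TOKENS:
        m = re.match(pattern, text)
        if m and (best is None or len(m.group()) > len(best[1])):
            best = (name, m.group())
    return best
```

For the input "12pt", both NUMBER ("12") and DIMENSION ("12pt") match at the start, and the longer DIMENSION match wins.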

Below is the core syntax for CSS. The sections that follow describe how to use it. Appendix B describes a more restrictive grammar that is closer to the CSS level 2 language.

stylesheet  : [ CDO | CDC | S | statement ]*;
statement   : ruleset | at-rule;
at-rule     : ATKEYWORD S* any* [ block | ';' S* ];
block       : '{' S* [ any | block | ATKEYWORD S* | ';' ]* '}' S*;
ruleset     : selector '{' S* declaration? [ ';' S* declaration ]* '}' S*;
selector    : any+;
declaration : property ':' S* value;
property    : IDENT S*;
value       : [ any | block | ATKEYWORD S* ]+;
any         : [ IDENT | NUMBER | PERCENTAGE | DIMENSION | STRING
              | DELIM | URI | HASH | UNICODE-RANGE | INCLUDES
              | '(' any* ')' | '[' any* ']' ] S*;

COMMENT tokens do not occur in the grammar (to keep it readable), but any number of these tokens may appear anywhere.

In some cases, user agents must "skip" part of an illegal style sheet. This specification defines skip to mean that the user agent parses the illegal string (from beginning to end) in order to find where it ends, but otherwise acts as if the string were not there.

An identifier consists of letters, digits, hyphens, non-ASCII, and escaped characters.

4.1.2 Characters and case

The following rules always hold:

4.1.3 Statements

A CSS style sheet, for any version of CSS, consists of a list of statements (see the grammar above). There are two kinds of statements: at-rules and rule sets. There may be whitespace around the statements.

In this specification, the expressions "immediately before" or "immediately after" mean with no intervening white space or comments.

4.1.4 At-rules

At-rules start with an at-keyword, which is an identifier beginning with '@' (for example, '@import' or '@page').

An at-rule consists of everything up to and including the next semicolon (;) or the next block, whichever comes first. A CSS UA that encounters an unrecognized at-rule must skip the whole of the @-rule and continue parsing after it.

CSS2 user agents have some additional constraints, e.g., they must also skip any '@import' rule that occurs inside a block or that doesn't precede all rule sets.

Here is an example. Assume a CSS2 parser encounters this style sheet:

  @import "subs.css";
  H1 { color: blue }
  @import "list.css";

The second '@import' is illegal according to CSS2. The CSS2 parser skips the whole at-rule, effectively reducing the style sheet to:

  @import "subs.css";
  H1 { color: blue }

In the following example, the second '@import' rule is invalid, since it occurs inside a '@media' block.

  @import "subs.css";
  @media print {
    @import "print-main.css";
    BODY { font-size: 10pt }
  }
  H1 {color: blue }

4.1.5 Blocks

A block starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there may be any characters, except that parentheses (()), brackets ([]) and braces ({}) must always occur in matching pairs and may be nested. Single (') and double quotes (") must also occur in matching pairs, and characters between them are parsed as a string. See Tokenization above for the definition of a string.

Here is an example of a block. Note that the right brace between the double quotes does not match the opening brace of the block, and that the second single quote is an escaped character, and thus doesn't match the first single quote:

  { causta: "}" + ({7} * '\'') }

Note that the above rule is not legal CSS2, but it is still a block as defined above.
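
The matching rules for blocks can be sketched as a small scanner in Python. This is an illustration under simplifying assumptions: it works on characters rather than tokens, it does not handle comments, and the function name `find_block_end` is hypothetical.

```python
def find_block_end(text, start=0):
    """Return the index of the '}' that closes the block opening at text[start]."""
    pairs = {"{": "}", "(": ")", "[": "]"}
    if text[start] != "{":
        raise ValueError("a block must start with '{'")
    stack = []    # closing characters still expected, innermost last
    quote = None  # quote character of the string being scanned, if any
    i = start
    while i < len(text):
        ch = text[i]
        if quote is not None:
            if ch == "\\":
                i += 1          # skip the escaped character inside the string
            elif ch == quote:
                quote = None    # matching quote ends the string
        elif ch in "\"'":
            quote = ch
        elif ch in pairs:
            stack.append(pairs[ch])
        elif ch in ")]}":
            if not stack or stack.pop() != ch:
                raise ValueError(f"unbalanced {ch!r} at index {i}")
            if not stack:
                return i        # the outermost brace has been closed
        elif ch == "\\":
            i += 1              # skip an escaped character outside a string
        i += 1
    raise ValueError("block is not closed")


# The example block from the text: the '}' inside the double quotes and the
# escaped single quote do not end the block or the string.
block = r"""{ causta: "}" + ({7} * '\'') }"""
end = find_block_end(block)
```

Here `end` is the index of the final right brace, i.e. `len(block) - 1`.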

4.1.6 Rule sets, declaration blocks, and selectors

A rule set consists of a selector followed by a declaration block.

A declaration-block (also called a {}-block in the following text) starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there must be a list of zero or more semicolon-separated (;) declarations.

The selector (see also the section on selectors) consists of everything up to (but not including) the first left curly brace ({). A selector always goes together with a {}-block. When a UA can't parse the selector (i.e., it is not valid CSS2), it should skip the {}-block as well.

Note. CSS2 gives a special meaning to the comma (,) in selectors. However, since it is not known if the comma may acquire other meanings in future versions of CSS, the whole statement should be skipped if there is an error anywhere in the selector, even though the rest of the selector may look reasonable in CSS2.

For example, since the "&" is not a legal token in a CSS2 selector, a CSS2 UA must skip the whole second line, and not set the color of H3 to red:

H1, H2 {color: green }
H3, H4 & H5 {color: red }
H6 {color: black }

Here is a more complex example. The first two pairs of curly braces are inside a string, and do not mark the end of the selector. This is a legal CSS2 statement.

    P[example="public class foo
{
    private int x;

    foo(int x) {
        this.x = x;
    }

}"] { color: red }

4.1.7 Declarations and properties

A declaration is either empty or consists of a property, followed by a colon (:), followed by a value. Around each of these there may be whitespace.

Multiple declarations for the same selector may be organized into semicolon (;) separated groups.

Thus, the following rules:

  H1 { font-weight: bold }
  H1 { font-size: 12pt }
  H1 { line-height: 14pt }
  H1 { font-family: Helvetica }
  H1 { font-variant: normal }
  H1 { font-style: normal }

are equivalent to:

  H1 { 
    font-weight: bold; 
    font-size: 12pt;
    line-height: 14pt; 
    font-family: Helvetica; 
    font-variant: normal;
    font-style: normal;
  }

A property is an identifier. Any character may occur in the value, but parentheses (()), brackets ([]), braces ({}), single quotes (') and double quotes (") must come in matching pairs. Parentheses, brackets, and braces may be nested. Inside the quotes, characters are parsed as a string.

Values are specified separately for each property, but in any case are built from identifiers, strings, numbers, lengths, percentages, URIs, colors, angles, times, and frequencies.

To ensure that new properties and new values for existing properties can be added in the future, a UA must skip a declaration with an invalid property name or an invalid value. Every CSS2 property has its own syntactic and semantic restrictions on the values it accepts.

For example, assume a CSS2 parser encounters this style sheet:

  H1 { color: red; font-style: 12pt }  /* Invalid value: 12pt */
  P { color: blue;  font-vendor: any;  /* Invalid prop.: font-vendor */
      font-variant: small-caps }
  EM EM { font-style: normal }

The second declaration on the first line has an invalid value '12pt'. The second declaration on the second line contains an undefined property 'font-vendor'. The CSS2 parser will skip these declarations, effectively reducing the style sheet to:

  H1 { color: red; }
  P { color: blue;  font-variant: small-caps }
  EM EM { font-style: normal }
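
The skipping behavior in this example can be sketched in Python. The table `KNOWN_VALUES` and the function `keep_valid` are hypothetical and cover only the properties and keywords needed for the example; a real UA validates each property against its full value grammar and does not split on ';' this naively (a ';' inside a string would break this sketch).

```python
# Hypothetical, heavily reduced table of properties and accepted keywords.
KNOWN_VALUES = {
    "color": {"red", "blue"},
    "font-style": {"normal", "italic", "oblique"},
    "font-variant": {"normal", "small-caps"},
}


def keep_valid(declarations):
    """Return the (property, value) pairs that survive error handling."""
    kept = []
    for decl in declarations.split(";"):
        if ":" not in decl:
            continue  # empty declaration
        prop, value = decl.split(":", 1)
        prop, value = prop.strip().lower(), value.strip()
        # Unknown property or invalid value: skip this declaration only.
        if value in KNOWN_VALUES.get(prop, ()):
            kept.append((prop, value))
    return kept
```

Only the offending declaration is dropped; the other declarations in the same block still apply.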

4.1.8 Comments

Comments begin with the characters "/*" and end with the characters "*/". They may occur anywhere where whitespace can occur and their contents have no influence on the rendering. Comments may not be nested.

CSS also allows the SGML comment delimiters ("<!--" and "-->") in certain places, but they do not delimit CSS comments. They are permitted so that style rules appearing in an HTML source document (in the STYLE element) may be hidden from pre-HTML3.2 user agents. See [HTML40] for more information.

4.2 Rules for handling parsing errors

User agents are required to obey the following rules when they encounter these parsing errors:

4.3 Values

4.3.1 Integers and real numbers

Some value types may have integer values (denoted by <integer>) or real number values (denoted by <number>). Real numbers and integers are specified in decimal notation only. An <integer> consists of one or more digits "0" to "9". A <number> can either be an <integer>, or it can be zero or more digits followed by a dot (.) followed by one or more digits. Both integers and real numbers may be preceded by a "-" or "+" to indicate the sign.
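
As a rough illustration, the two definitions translate into the following Python regular expressions. The helper names `is_integer` and `is_number` are hypothetical.

```python
import re

# Optional sign, then one or more digits.
INTEGER = re.compile(r"[+-]?[0-9]+")
# Optional sign, then either an integer or "digits? . digits+".
NUMBER = re.compile(r"[+-]?(?:[0-9]+|[0-9]*\.[0-9]+)")


def is_integer(text):
    """True if the whole of text is an <integer>."""
    return INTEGER.fullmatch(text) is not None


def is_number(text):
    """True if the whole of text is a <number>."""
    return NUMBER.fullmatch(text) is not None
```

Under these definitions ".5" and "-12" are numbers, while "1." and "1e3" are not, since a dot must be followed by at least one digit and there is no exponent notation.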

Note that many properties that allow an integer or real number as a value actually restrict the value to some range, often to a non-negative value.

4.3.2 Lengths

The format of a length value (denoted by <length> in this specification) is an optional sign character ('+' or '-', with '+' being the default) immediately followed by a <number> (with or without a decimal point) immediately followed by a unit identifier (e.g., px or em). After the number '0', the unit identifier is optional.

Some properties allow negative length values, but this may complicate the formatting model and there may be implementation-specific limits. If a negative length value cannot be supported, it should be converted to the nearest value that can be supported.

There are two types of length units: relative and absolute. Relative length units specify a length relative to another length property. Style sheets that use relative units will more easily scale from one medium to another (e.g., from a computer display to a laser printer).

Relative units are: em, ex, and px.

  H1 { margin: 0.5em }      /* em: the height of the element's font */
  H1 { margin: 1ex }        /* ex: the height of the letter 'x' */
  P  { font-size: 12px }    /* px: pixels, relative to viewing device */

The 'em' unit, as used in CSS, is equal to the font size used when rendering an element's text. It may be used for vertical or horizontal measurement. The 'ex' unit is equal to the font's x-height (the height of the letter 'x') of the element's font. A font need not contain the letter "M" to have an 'em' size or the letter "x" to have an x-height; the font should still define the two units.

Both 'em' and 'ex' refer to the font size of an element except when used in the 'font-size' property, where they are relative to the font size inherited from the parent element.

The rule:

  H1 { line-height: 1.2em }

means that the line height of H1 elements will be 20% greater than the font size of the H1 elements. On the other hand:

  H1 { font-size: 1.2em }

means that the font-size of H1 elements will be 20% greater than the font size inherited by H1 elements.

When specified for the root of the document tree (e.g., HTML in HTML), 'em' and 'ex' refer to the property's initial value.

Please consult the section on line height calculations for more information about line heights in the visual flow model.

Pixel units are relative to the resolution of the viewing device, i.e., most often a computer display. If the pixel density of the output device is very different from that of a typical computer display, the UA should rescale pixel values. The suggested reference pixel is the visual angle of one pixel on a device with a pixel density of 90dpi and a distance from the reader of an arm's length. For a nominal arm's length of 28 inches, the visual angle is about 0.0227 degrees.
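
The 0.0227 degree figure can be checked with a short calculation. The function name and parameter names below are hypothetical; the inputs (90dpi, 28 inches) are the ones given above.

```python
import math


def reference_pixel_angle_deg(dpi=90.0, distance_in=28.0):
    """Visual angle, in degrees, subtended by one pixel at the given distance."""
    pixel_in = 1.0 / dpi  # width of one pixel in inches
    # Angle subtended by a segment of width pixel_in centered on the line of sight.
    return math.degrees(2.0 * math.atan(pixel_in / (2.0 * distance_in)))


angle = reference_pixel_angle_deg()
```

Rounded to four decimal places, `angle` is 0.0227.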

Child elements do not inherit the relative values specified for their parent; they inherit the computed values. For example:

  BODY {
    font-size: 12pt;
    text-indent: 3em;  /* i.e. 36pt */
  }
  H1 { font-size: 15pt } 

In these rules, the 'text-indent' value of H1 elements will be 36pt, not 45pt, if H1 is a child of the BODY element.
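
The arithmetic behind this example, as a minimal Python sketch (the helper name `em_to_pt` is hypothetical):

```python
def em_to_pt(ems, font_size_pt):
    """Resolve an em length against the font size of the element it is set on."""
    return ems * font_size_pt


# 'text-indent: 3em' is resolved on BODY, whose font size is 12pt.
body_text_indent = em_to_pt(3, 12)

# H1 inherits the computed value, so its own 15pt font size plays no role.
h1_text_indent = body_text_indent

# What H1 would get if the relative value '3em' were inherited instead.
not_inherited = em_to_pt(3, 15)
```

`h1_text_indent` is 36 (points), whereas re-resolving 3em against the H1 font size would give 45.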

Absolute length units are only useful when the physical properties of the output medium are known. The absolute units are: in (inches), cm (centimeters), mm (millimeters), pt (points), and pc (picas).

For example:

  H1 { margin: 0.5in }      /* inches, 1in = 2.54cm */
  H2 { line-height: 3cm }   /* centimeters */
  H3 { word-spacing: 4mm }  /* millimeters */
  H4 { font-size: 12pt }    /* points, 1pt = 1/72 in */
  H4 { font-size: 1pc }     /* picas, 1pc = 12pt */
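
The conversion factors in the comments above can be collected into a small Python table. The names `POINTS_PER_UNIT` and `to_points` are hypothetical, and points are chosen as the common unit only for convenience.

```python
# Points per unit, derived from 1in = 2.54cm = 72pt and 1pc = 12pt.
POINTS_PER_UNIT = {
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
    "pt": 1.0,
    "pc": 12.0,
}


def to_points(value, unit):
    """Convert an absolute length to points."""
    return value * POINTS_PER_UNIT[unit]
```

For instance, `to_points(0.5, "in")` gives 36.0, and 'font-size: 12pt' and 'font-size: 1pc' describe the same size.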

In cases where the specified length cannot be supported, UAs should try to approximate. For all CSS2 properties, further computations and inheritance should be based on the approximated value.

4.3.3 Percentages

The format of a percentage value (denoted by <percentage> in this specification) is an optional sign character ('+' or '-', with '+' being the default) immediately followed by a number immediately followed by '%'.

Percentage values are always relative to another value, for example a length. Each property that allows percentages also defines to which value the percentage refers. When a percentage value is set for a property of the root of the document tree and the percentage is defined as referring to the inherited value of some property X, the resultant value is the percentage times the initial value of property X.

Since child elements inherit the computed values of their parent, in the following example, the children of the P element will inherit a value of 12pt for 'line-height', not the percentage value (120%):

  P { font-size: 10pt }
  P { line-height: 120% }  /* relative to 'font-size', i.e. 12pt */

4.3.4 URIs

This specification uses the term Uniform Resource Identifier (URI) as defined in [URI] (see also [RFC1630]).

Note that URIs include URLs (as defined in [RFC1738] and [RFC1808]).

Relative URIs are resolved to full URIs using a base URI. [RFC1808], section 3, defines the normative algorithm for this process.

URI values in this specification are denoted by <uri>.

For historical reasons, the functional notation used to designate URI values is "url()".

For example:

  BODY { background: url(http://www.bg.com/pinkish.gif) }

The format of a URI value is 'url(' followed by optional whitespace followed by an optional single quote (') or double quote (") character followed by the URI itself, followed by an optional single quote (') or double quote (") character followed by optional whitespace followed by ')'. Quote characters that are not part of the URI itself must be balanced.

Parentheses, commas, whitespace characters, single quotes (') and double quotes (") appearing in a URI must be escaped with a backslash: '\(', '\)', '\,'.

In order to create modular style sheets that are not dependent on the absolute location of a resource, authors may specify the location of background images with partial URIs. Partial URIs (as defined in [RFC1808]) are interpreted relative to the base URI of the style sheet, not relative to the base URI of the source document.

For example, suppose the following rule is located in a style sheet designated by the URI http://www.myorg.org/style/basic.css:

  BODY { background: url(yellow) }

The background of the source document's BODY will be tiled with whatever image is described by the resource designated by the URI http://www.myorg.org/style/yellow.
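
The resolution step in this example can be reproduced with Python's standard library. Note the assumption: `urllib.parse.urljoin` follows later URI specifications rather than [RFC1808] itself, but for a simple relative reference like this one the result is the same.

```python
from urllib.parse import urljoin

# Base URI: the style sheet itself, not the source document.
style_sheet_uri = "http://www.myorg.org/style/basic.css"

# The partial URI taken from 'url(yellow)'.
resolved = urljoin(style_sheet_uri, "yellow")
```

`resolved` is "http://www.myorg.org/style/yellow", regardless of where the document that links to the style sheet is located.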

User agents may vary in how they handle URIs that designate unavailable or inapplicable resources.

4.3.5 Colors

A <color> is either a keyword or a numerical RGB specification.

The suggested list of keyword color names is: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, purple, red, silver, teal, white, and yellow. These 16 colors are taken from the Windows VGA palette, and their RGB values are not defined in this specification.

  BODY {color: black; background: white }
  H1 { color: maroon }
  H2 { color: olive }

The RGB color model is used in numerical color specifications. These examples all specify the same color:

  EM { color: #f00 }              /* #rgb */
  EM { color: #ff0000 }           /* #rrggbb */
  EM { color: rgb(255,0,0) }      /* integer range 0 - 255 */
  EM { color: rgb(100%, 0%, 0%) } /* float range 0.0% - 100.0% */

In addition to these color keywords, users may specify keywords that correspond to the colors used by certain objects in the user's environment. Please consult the section on system colors for more information.

The format of an RGB value in hexadecimal notation is a '#' immediately followed by either three or six hexadecimal characters. The three-digit RGB notation (#rgb) is converted into six-digit form (#rrggbb) by replicating each digit, not by adding zeros. For example, #fb0 expands to #ffbb00. This makes sure that white (#ffffff) can be specified with the short notation (#fff) and removes any dependencies on the color depth of the display.
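
The expansion of the short notation can be sketched in Python as follows (the function name `expand_hex` is hypothetical):

```python
import string


def expand_hex(color):
    """Expand '#rgb' to '#rrggbb'; return '#rrggbb' input unchanged."""
    if not color.startswith("#"):
        raise ValueError("hexadecimal colors start with '#'")
    digits = color[1:]
    if any(c not in string.hexdigits for c in digits):
        raise ValueError("not a hexadecimal color: " + color)
    if len(digits) == 3:
        # Replicate each digit: 'fb0' becomes 'ffbb00'.
        return "#" + "".join(c * 2 for c in digits)
    if len(digits) == 6:
        return color
    raise ValueError("expected three or six hexadecimal digits")
```

With this rule `expand_hex("#fff")` gives "#ffffff", whereas padding with zeros would have produced "#f0f0f0", which is not white.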

The format of an RGB value in the functional notation is 'rgb(' followed by a comma-separated list of three numerical values (either three integer values in the range of 0-255, or three percentage values, typically in the range of 0.0% to 100.0%) followed by ')'. Whitespace characters are allowed around the numerical values.

Values outside the device gamut should be clipped. For a device whose gamut is sRGB, the three rules below are equivalent:

  EM { color: rgb(255,0,0) }       /* integer range 0 - 255 */
  EM { color: rgb(300,0,0) }       /* clipped to 255 */
  EM { color: rgb(110%, 0%, 0%) }  /* clipped to 100% */
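
A minimal Python sketch of parsing and clipping the functional notation for an sRGB device follows. The function name `parse_rgb` is hypothetical, and mapping percentages onto 0-255 by rounding is an assumption made for the illustration, not something this specification prescribes.

```python
def parse_rgb(text):
    """Parse 'rgb(r, g, b)' into a tuple of three integers clipped to 0-255."""
    text = text.strip()
    if not (text.startswith("rgb(") and text.endswith(")")):
        raise ValueError("expected rgb(...) notation")
    parts = [p.strip() for p in text[4:-1].split(",")]
    if len(parts) != 3:
        raise ValueError("expected three comma-separated values")
    percent = [p.endswith("%") for p in parts]
    if any(percent) != all(percent):
        raise ValueError("use three integers or three percentages, not a mix")
    channels = []
    for part in parts:
        if part.endswith("%"):
            value = float(part[:-1]) * 255 / 100  # assumption: scale to 0-255
        else:
            value = int(part)
        # Clip to the sRGB range, then round to an integer channel value.
        channels.append(int(round(min(255, max(0, value)))))
    return tuple(channels)
```

All three rules above yield (255, 0, 0) under this sketch.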

All RGB colors are specified in the sRGB color space (see [SRGB]). UAs may vary in the fidelity with which they represent these colors, but using sRGB provides an unambiguous and objectively measurable definition of what the color should be, which can be related to international standards (see [COLORIMETRY]).

Conforming UAs may limit their color-displaying efforts to performing a gamma-correction on them. sRGB specifies a display gamma of 2.2 under specified viewing conditions. UAs should adjust the colors given in CSS such that, in combination with an output device's "natural" display gamma, an effective display gamma of 2.2 is produced. See the section on gamma correction for further details. Note that only colors specified in CSS are affected; e.g., images are expected to carry their own color information.

4.3.6 Angles

Angle values (denoted by <angle> in the text) are used with aural cascading style sheets.

Their format is an optional sign character ('+' or '-', with '+' being the default) immediately followed by a <number> immediately followed by an angle unit identifier. After a '0' number, the unit identifier is optional.

The following are legal angle unit identifiers:

Angle values may be negative. They should be normalized to the range 0-360deg by the UA. For example, -10deg and 350deg are equivalent. The angle value must be followed immediately by the angle unit.
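
Normalization of an angle given in degrees can be sketched in Python as follows (the function name `normalize_deg` is hypothetical, and mapping a full turn to 0 is a choice made for this sketch):

```python
def normalize_deg(angle):
    """Map an angle in degrees onto the range 0 (inclusive) to 360 (exclusive)."""
    # For a positive modulus, Python's % always returns a non-negative result.
    return angle % 360
```

For example, `normalize_deg(-10)` gives 350, matching the equivalence stated above.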

4.3.7 Times

Time values (denoted by <time> in the text) are used with aural cascading style sheets.

Their format is a <number> immediately followed by a time unit identifier. After a '0' number, the unit identifier is optional.

The following are legal time unit identifiers:

Time values may not be negative. The time value must be followed immediately by the time unit.

4.3.8 Frequencies

Frequency values (denoted by <frequency> in the text) are used with aural cascading style sheets.

Their format is a <number> immediately followed by a frequency unit identifier. After a '0' number, the unit identifier is optional.

There are two legal frequency unit identifiers: Hz (hertz) and kHz (kilohertz).

For example, 200Hz (or 200hz) is a bass sound, and 6kHz (or 6khz) is a treble sound.

The frequency value must be followed immediately by the frequency unit.

4.3.9 Strings

Strings can either be written with double quotes or with single quotes. Double quotes cannot occur inside double quotes, unless escaped (as '\"' or as '\22'). Analogously for single quotes ("\'" or "\27"). Examples:

"this is a 'string'"
"this is a \"string\""
'this is a "string"'
'this is a \'string\''

A string cannot directly contain a newline. To include a newline in a string, use the escape "\A" (hexadecimal A is the line feed character in Unicode, but represents the generic notion of "newline" in CSS). Sometimes it is safer to write "\00000A", since that will avoid the situation where the character following the "A" can be interpreted as a hexadecimal digit. For example, in the string

"A. one\AB. two"

the UA will see an escape sequence "\AB" («) instead of \A.
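
The pitfall can be reproduced with a small Python sketch of escape decoding. It implements only what is described in this section: a backslash followed by up to six hexadecimal digits, an escaped newline (which is dropped), or a backslash followed by any other character. The function name `unescape` is hypothetical, and the sketch takes the contents of a string without its surrounding quotes.

```python
import re

# Hex escape (greedy, at most six digits), escaped newline, or escaped character.
_ESCAPE = re.compile(r"\\(?:([0-9a-fA-F]{1,6})|\n|(.))", re.DOTALL)


def unescape(body):
    """Decode backslash escapes in the contents of a CSS string."""

    def replace(match):
        if match.group(1):
            return chr(int(match.group(1), 16))  # numeric character escape
        if match.group(2):
            return match.group(2)                # e.g. \" or \'
        return ""                                # escaped newline: line continuation

    return _ESCAPE.sub(replace, body)


wrong = unescape(r"A. one\AB. two")       # "\AB" is read as one escape
right = unescape(r"A. one\00000AB. two")  # six digits end the escape before "B"
```

`wrong` contains the character « and no newline, while `right` contains a newline followed by "B. two".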

It is possible to break strings over several lines, for aesthetic or other reasons, but in such a case the newline itself has to be escaped with a "\". For instance, the following two selectors are exactly the same:

A[TITLE="a not s\
o very long title"] {border: double}
A[TITLE="a not so very long title"] {border: double}

4.4 CSS embedded in HTML

CSS style sheets may be embedded in HTML documents, and to be able to hide style sheets from older UAs, it is convenient to put the style sheets inside HTML comments. Please consult [HTML40] for more information.

When CSS is embedded in HTML, it shares the charset parameter used to transmit the enclosing HTML document. As with HTML, the value of the charset parameter is used to convert from the transfer encoding to the document character set, which is specified by [ISO10646].

4.5 CSS as a stand-alone file

CSS style sheets may exist in files by themselves, being linked from the document. In this case, the CSS files are served with the media type text/css. As with all text media types, a charset parameter may be added which is used to convert from the transfer encoding to [ISO10646].

4.6 Character escapes in CSS

CSS may need to use characters that are outside the encoding used to transmit the document. For example, the "class" attribute of HTML allows more characters in a class name than the set allowed for selectors above. In CSS2, such characters can be escaped or written as [ISO10646] numbers.

For instance, "B&W?" may be written as "B\&W\?" or "B\26W\3F". For example, a document transmitted as ISO-8859-1 (Latin-1) cannot contain Greek letters directly: "κουρος" (Greek: "kouros") has to be written as "\3BA\3BF\3C5\3C1\3BF\3C2". These escapes are thus the CSS equivalent of numeric character references in HTML or XML documents.
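
As an illustration, the following Python sketch writes characters that the transmission encoding cannot represent as numeric escapes. The function name `escape_for_encoding` is hypothetical. Padding to six digits when the next character is a hexadecimal digit is a precaution taken by this sketch so that the escape cannot absorb the following character.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")


def escape_for_encoding(text, encoding="iso-8859-1"):
    """Replace characters not representable in encoding with CSS numeric escapes."""
    out = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
            out.append(ch)  # representable: keep as-is
        except UnicodeEncodeError:
            following = text[i + 1:i + 2]
            # Use the full six-digit form if a hex digit follows the escape.
            width = 6 if following in HEX_DIGITS else 1
            out.append(f"\\{ord(ch):0{width}X}")
    return "".join(out)


kouros = escape_for_encoding("\u03ba\u03bf\u03c5\u03c1\u03bf\u03c2")
```

`kouros` is the string \3BA\3BF\3C5\3C1\3BF\3C2, the same sequence as in the example above.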