4 Syntax and basic data types

Contents

4.1 Syntax

This section describes a grammar (and forward-compatible parsing rules) common to any level of CSS (including CSS 2.1). Future updates of CSS will adhere to this core syntax, although they may add additional syntactic constraints.

These descriptions are normative. They are also complemented by the normative grammar rules presented in Appendix G.

In this specification, the expressions "immediately before" or "immediately after" mean with no intervening whitespace or comments.

4.1.1 Tokenization

All levels of CSS — level 1, level 2, and any future levels — use the same core syntax. This allows UAs to parse (though not completely understand) style sheets written in levels of CSS that didn't exist at the time the UAs were created. Designers can use this feature to create style sheets that work with older user agents, while also exercising the possibilities of the latest levels of CSS.

At the lexical level, CSS style sheets consist of a sequence of tokens. The list of tokens for CSS is as follows. The definitions use Lex-style regular expressions. Octal codes refer to ISO 10646 ([ISO10646]). As in Lex, in case of multiple matches, the longest match determines the token.

Token Definition

IDENT {ident}
ATKEYWORD @{ident}
STRING {string}
INVALID {invalid}
HASH #{name}
NUMBER {num}
PERCENTAGE {num}%
DIMENSION {num}{ident}
URI url\({w}{string}{w}\)
|url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}\)
UNICODE-RANGE U\+[0-9a-f?]{1,6}(-[0-9a-f]{1,6})?
CDO <!--
CDC -->
; ;
{ \{
} \}
( \(
) \)
[ \[
] \]
S [ \t\r\n\f]+
COMMENT \/\*[^*]*\*+([^/*][^*]*\*+)*\/
FUNCTION {ident}\(
INCLUDES ~=
DASHMATCH |=
DELIM any other character not matched by the above rules, and neither a single nor a double quote

The macros in curly braces ({}) above are defined as follows:

Macro Definition

ident [-]?{nmstart}{nmchar}*
name {nmchar}+
nmstart [_a-z]|{nonascii}|{escape}
nonascii[^\0-\177]
unicode \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?
escape {unicode}|\\[^\n\r\f0-9a-f]
nmchar [_a-z0-9-]|{nonascii}|{escape}
num [0-9]+|[0-9]*\.[0-9]+
string {string1}|{string2}
string1 \"([^\n\r\f\\"]|\\{nl}|{escape})*\"
string2 \'([^\n\r\f\\']|\\{nl}|{escape})*\'
invalid {invalid1}|{invalid2}
invalid1\"([^\n\r\f\\"]|\\{nl}|{escape})*
invalid2\'([^\n\r\f\\']|\\{nl}|{escape})*
nl \n|\r\n|\r|\f
w [ \t\r\n\f]*

Below is the core syntax for CSS. The sections that follow describe how to use it. Appendix G describes a more restrictive grammar that is closer to the CSS level 2 language. Parts of style sheets that can be parsed according to this grammar but not according to the grammar in Appendix G are among the parts that will be ignored according to the rules for handling parsing errors.

stylesheet  : [ CDO | CDC | S | statement ]*;
statement   : ruleset | at-rule;
at-rule     : ATKEYWORD S* any* [ block | ';' S* ];
block       : '{' S* [ any | block | ATKEYWORD S* | ';' S* ]* '}' S*;
ruleset     : selector? '{' S* declaration? [ ';' S* declaration? ]* '}' S*;
selector    : any+;
declaration : DELIM? property S* ':' S* value;
property    : IDENT;
value       : [ any | block | ATKEYWORD S* ]+;
any         : [ IDENT | NUMBER | PERCENTAGE | DIMENSION | STRING
              | DELIM | URI | HASH | UNICODE-RANGE | INCLUDES
              | DASHMATCH | FUNCTION S* any* ')' 
              | '(' S* any* ')' | '[' S* any* ']' ] S*;

COMMENT tokens do not occur in the grammar (to keep it readable), but any number of these tokens may appear anywhere between other tokens.

The token S in the grammar above stands for whitespace. Only the characters "space" (U+0020), "tab" (U+0009), "line feed" (U+000A), "carriage return" (U+000D), and "form feed" (U+000C) can occur in whitespace. Other space-like characters, such as "em-space" (U+2003) and "ideographic space" (U+3000), are never part of whitespace.

The meaning of input that cannot be tokenized or parsed is undefined in CSS 2.1.

4.1.2 Keywords

Keywords have the form of identifiers. Keywords must not be placed between quotes ("..." or '...'). Thus,

red

is a keyword, but

"red"

is not. (It is a string.) Other illegal examples:

Illegal example(s):


width: "auto";
border: "none";
background: "red";

4.1.2.1 Vendor-specific extensions

In CSS, identifiers may begin with '-' (dash) or '_' (underscore). Keywords and property names beginning with -' or '_' are reserved for vendor-specific extensions. Such vendor-specific extensions should have one of the following formats:

'-' + vendor identifier + '-' + meaningful name
'_' + vendor identifier + '-' + meaningful name

Example(s):

For example, if XYZ organization added a property to describe the color of the border on the East side of the display, they might call it -xyz-border-east-color.

Other known examples:

-moz-box-sizing
-moz-border-radius
-wap-accesskey

An initial dash or underscore is guaranteed never to be used in a property or keyword by any current or future level of CSS. Thus typical CSS implementations may not recognize such properties and may ignore them according to the rules for handling parsing errors. However, because the initial dash or underscore is part of the grammar, CSS 2.1 implementers should always be able to use a CSS-conforming parser, whether or not they support any vendor-specific extensions.

Authors should avoid vendor-specific extensions

4.1.2.2 Informative Historical Notes

This section is informative.

At the time of writing, the following prefixes are known to exist:

prefixorganization
-ms-Microsoft
mso-Microsoft Office
-moz-Mozilla
-o-Opera Software
-atsc-Advanced Television Standards Committee
-wap-The WAP Forum

4.1.3 Characters and case

The following rules always hold:

4.1.4 Statements

A CSS style sheet, for any level of CSS, consists of a list of statements (see the grammar above). There are two kinds of statements: at-rules and rule sets. There may be whitespace around the statements.

4.1.5 At-rules

At-rules start with an at-keyword, an '@' character followed immediately by an identifier (for example, '@import', '@page').

An at-rule consists of everything up to and including the next semicolon (;) or the next block, whichever comes first.

CSS 2.1 user agents must ignore any '@import' rule that occurs inside a block or after any valid statement other than an @charset or an @import rule.

Illegal example(s):

Assume, for example, that a CSS 2.1 parser encounters this style sheet:


@import "subs.css";
h1 { color: blue }
@import "list.css";

The second '@import' is illegal according to CSS 2.1. The CSS 2.1 parser ignores the whole at-rule, effectively reducing the style sheet to:


@import "subs.css";
h1 { color: blue }

Illegal example(s):

In the following example, the second '@import' rule is invalid, since it occurs inside a '@media' block.


@import "subs.css";
@media print {
  @import "print-main.css";
  body { font-size: 10pt }
}
h1 {color: blue }

Instead, to achieve the effect of only importing a style sheet for 'print' media, use the @import rule with media syntax, e.g.:


@import "subs.css";
@import "print-main.css" print;
@media print {
  body { font-size: 10pt }
}
h1 {color: blue }

4.1.6 Blocks

A block starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there may be any tokens, except that parentheses (( )), brackets ([ ]) and braces ({ }) must always occur in matching pairs and may be nested. Single (') and double quotes (") must also occur in matching pairs, and characters between them are parsed as a string. See Tokenization above for the definition of a string.

Illegal example(s):

Here is an example of a block. Note that the right brace between the double quotes does not match the opening brace of the block, and that the second single quote is an escaped character, and thus doesn't match the first single quote:


{ causta: "}" + ({7} * '\'') }

Note that the above rule is not valid CSS 2.1, but it is still a block as defined above.

4.1.7 Rule sets, declaration blocks, and selectors

A rule set (also called "rule") consists of a selector followed by a declaration block.

A declaration block starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there must be a list of zero or more semicolon-separated (;) declarations.

The selector (see also the section on selectors) consists of everything up to (but not including) the first left curly brace ({). A selector always goes together with a declaration block. When a user agent can't parse the selector (i.e., it is not valid CSS 2.1), it must ignore the declaration block as well.

CSS 2.1 gives a special meaning to the comma (,) in selectors. However, since it is not known if the comma may acquire other meanings in future updates of CSS, the whole statement should be ignored if there is an error anywhere in the selector, even though the rest of the selector may look reasonable in CSS 2.1.

Illegal example(s):

For example, since the "&" is not a valid token in a CSS 2.1 selector, a CSS 2.1 user agent must ignore the whole second line, and not set the color of H3 to red:


h1, h2 {color: green }
h3, h4 & h5 {color: red }
h6 {color: black }

Example(s):

Here is a more complex example. The first two pairs of curly braces are inside a string, and do not mark the end of the selector. This is a valid CSS 2.1 rule.


p[example="public class foo\
{\
    private int x;\
\
    foo(int x) {\
        this.x = x;\
    }\
\
}"] { color: red }

4.1.8 Declarations and properties

A declaration is either empty or consists of a property name, followed by a colon (:), followed by a value. Around each of these there may be whitespace.

Because of the way selectors work, multiple declarations for the same selector may be organized into semicolon (;) separated groups.

Example(s):

Thus, the following rules:


h1 { font-weight: bold }
h1 { font-size: 12px }
h1 { line-height: 14px }
h1 { font-family: Helvetica }
h1 { font-variant: normal }
h1 { font-style: normal }

are equivalent to:


h1 {
  font-weight: bold;
  font-size: 12px;
  line-height: 14px;
  font-family: Helvetica;
  font-variant: normal;
  font-style: normal
}

A property name is an identifier. Any token may occur in the value. Parentheses ("( )"), brackets ("[ ]"), braces ("{ }"), single quotes (') and double quotes (") must come in matching pairs, and semicolons not in strings must be escaped. Parentheses, brackets, and braces may be nested. Inside the quotes, characters are parsed as a string.

The syntax of values is specified separately for each property, but in any case, values are built from identifiers, strings, numbers, lengths, percentages, URIs, colors, etc.

A user agent must ignore a declaration with an invalid property name or an invalid value. Every CSS 2.1 property has its own syntactic and semantic restrictions on the values it accepts.

Illegal example(s):

For example, assume a CSS 2.1 parser encounters this style sheet:


h1 { color: red; font-style: 12pt }  /* Invalid value: 12pt */
p { color: blue;  font-vendor: any;  /* Invalid prop.: font-vendor */
    font-variant: small-caps }
em em { font-style: normal }

The second declaration on the first line has an invalid value '12pt'. The second declaration on the second line contains an undefined property 'font-vendor'. The CSS 2.1 parser will ignore these declarations, effectively reducing the style sheet to:


h1 { color: red; }
p { color: blue;  font-variant: small-caps }
em em { font-style: normal }

4.1.9 Comments

Comments begin with the characters "/*" and end with the characters "*/". They may occur anywhere between tokens, and their contents have no influence on the rendering. Comments may not be nested.

CSS also allows the SGML comment delimiters ("<!--" and "-->") in certain places defined by the grammar, but they do not delimit CSS comments. They are permitted so that style rules appearing in an HTML source document (in the STYLE element) may be hidden from pre-HTML 3.2 user agents. See the HTML 4 specification ([HTML4]) for more information.

4.2 Rules for handling parsing errors

In some cases, user agents must ignore part of an illegal style sheet. This specification defines ignore to mean that the user agent parses the illegal part (in order to find its beginning and end), but otherwise acts as if it had not been there. CSS 2.1 reserves for future updates of CSS all property:value combinations and @-keywords that do not contain an identifier beginning with dash or underscore. Implementations must ignore such combinations (other than those introduced by future updates of CSS).

To ensure that new properties and new values for existing properties can be added in the future, user agents are required to obey the following rules when they encounter the following scenarios:

4.3 Values

4.3.1 Integers and real numbers

Some value types may have integer values (denoted by <integer>) or real number values (denoted by <number>). Real numbers and integers are specified in decimal notation only. An <integer> consists of one or more digits "0" to "9". A <number> can either be an <integer>, or it can be zero or more digits followed by a dot (.) followed by one or more digits. Both integers and real numbers may be preceded by a "-" or "+" to indicate the sign. -0 is equivalent to 0 and is not a negative number.

Note that many properties that allow an integer or real number as a value actually restrict the value to some range, often to a non-negative value.

4.3.2 Lengths

Lengths refer to horizontal or vertical measurements.

The format of a length value (denoted by <length> in this specification) is a <number> (with or without a decimal point) immediately followed by a unit identifier (e.g., px, em, etc.). After a zero length, the unit identifier is optional.

Some properties allow negative length values, but this may complicate the formatting model and there may be implementation-specific limits. If a negative length value cannot be supported, it should be converted to the nearest value that can be supported.

If a negative length value is set on a property that does not allow negative length values, the declaration is ignored.

There are two types of length units: relative and absolute. Relative length units specify a length relative to another length property. Style sheets that use relative units will more easily scale from one medium to another (e.g., from a computer display to a laser printer).

Relative units are:

Example(s):


h1 { margin: 0.5em }      /* em */
h1 { margin: 1ex }        /* ex */
p  { font-size: 12px }    /* px */

The 'em' unit is equal to the computed value of the 'font-size' property of the element on which it is used. The exception is when 'em' occurs in the value of the 'font-size' property itself, in which case it refers to the font size of the parent element. It may be used for vertical or horizontal measurement. (This unit is also sometimes called the quad-width in typographic texts.)

The 'ex' unit is defined by the element's first available font. The 'x-height' is so called because it is often equal to the height of the lowercase "x". However, an 'ex' is defined even for fonts that don't contain an "x".

The x-height of a font can be found in different ways. Some fonts contain reliable metrics for the x-height. If reliable font metrics are not available, UAs may determine the x-height from the height of a lowercase glyph. One possible heuristics is to look at how far the glyph for the lowercase "o" extends below the baseline, and subtract that value from the top of its bounding box. In the cases where it is impossible or impractical to determine the x-height, a value of 0.5em should be used.

Example(s):

The rule:


h1 { line-height: 1.2em }

means that the line height of "h1" elements will be 20% greater than the font size of the "h1" elements. On the other hand:


h1 { font-size: 1.2em }

means that the font-size of "h1" elements will be 20% greater than the font size inherited by "h1" elements.

When specified for the root of the document tree (e.g., "HTML" in HTML), 'em' and 'ex' refer to the property's initial value.

Pixel units are relative to the resolution of the viewing device, i.e., most often a computer display. If the pixel density of the output device is very different from that of a typical computer display, the user agent should rescale pixel values. It is recommended that the reference pixel be the visual angle of one pixel on a device with a pixel density of 96dpi and a distance from the reader of an arm's length. For a nominal arm's length of 28 inches, the visual angle is therefore about 0.0213 degrees.

For reading at arm's length, 1px thus corresponds to about 0.26 mm (1/96 inch). When printed on a laser printer, meant for reading at a little less than arm's length (55 cm, 21 inches), 1px is about 0.20 mm. On a 300 dots-per-inch (dpi) printer, that may be rounded up to 3 dots (0.25 mm); on a 600 dpi printer, it can be rounded to 5 dots.

The two images below illustrate the effect of viewing distance on the size of a pixel and the effect of a device's resolution. In the first image, a reading distance of 71 cm (28 inch) results in a px of 0.26 mm, while a reading distance of 3.5 m (12 feet) requires a px of 1.3 mm.

Showing that pixels must become
larger if the viewing distance increases   [D]

In the second image, an area of 1px by 1px is covered by a single dot in a low-resolution device (a computer screen), while the same area is covered by 16 dots in a higher resolution device (such as a 400 dpi laser printer).

Showing that more device pixels (dots)
are needed to cover a 1px by 1px area on a high-resolution device than
on a low-res one   [D]

Child elements do not inherit the relative values specified for their parent; they inherit the computed values.

Example(s):

In the following rules, the computed 'text-indent' value of "h1" elements will be 36px, not 45px, if "h1" is a child of the "body" element.


body {
  font-size: 12px;
  text-indent: 3em;  /* i.e., 36px */
}
h1 { font-size: 15px }

Absolute length units are only useful when the physical properties of the output medium are known. The absolute units are:

Example(s):


h1 { margin: 0.5in }      /* inches  */
h2 { line-height: 3cm }   /* centimeters */
h3 { word-spacing: 4mm }  /* millimeters */
h4 { font-size: 12pt }    /* points */
h4 { font-size: 1pc }     /* picas */

In cases where the used length cannot be supported, user agents must approximate it in the actual value.

4.3.3 Percentages

The format of a percentage value (denoted by <percentage> in this specification) is a <number> immediately followed by '%'.

Percentage values are always relative to another value, for example a length. Each property that allows percentages also defines the value to which the percentage refers. The value may be that of another property for the same element, a property for an ancestor element, or a value of the formatting context (e.g., the width of a containing block). When a percentage value is set for a property of the root element and the percentage is defined as referring to the inherited value of some property, the resultant value is the percentage times the initial value of that property.

Example(s):

Since child elements (generally) inherit the computed values of their parent, in the following example, the children of the P element will inherit a value of 12px for 'line-height', not the percentage value (120%):


p { font-size: 10px }
p { line-height: 120% }  /* 120% of 'font-size' */

4.3.4 URLs and URIs

URI values (Uniform Resource Identifiers, see [RFC3986], which includes URLs, URNs, etc) in this specification are denoted by <uri>. The functional notation used to designate URIs in property values is "url()", as in:

Example(s):


body { background: url("http://www.example.com/pinkish.png") }

The format of a URI value is 'url(' followed by optional whitespace followed by an optional single quote (') or double quote (") character followed by the URI itself, followed by an optional single quote (') or double quote (") character followed by optional whitespace followed by ')'. The two quote characters must be the same.

Example(s):

An example without quotes:


li { list-style: url(http://www.example.com/redball.png) disc }

Some characters appearing in an unquoted URI, such as parentheses, commas, whitespace characters, single quotes (') and double quotes ("), must be escaped with a backslash so that the resulting URI value is a URI token: '\(', '\)', '\,'.

Depending on the type of URI, it might also be possible to write the above characters as URI-escapes (where "(" = %28, ")" = %29, etc.) as described in [RFC3986].

In order to create modular style sheets that are not dependent on the absolute location of a resource, authors may use relative URIs. Relative URIs (as defined in [RFC3986]) are resolved to full URIs using a base URI. RFC 3986, section 5, defines the normative algorithm for this process. For CSS style sheets, the base URI is that of the style sheet, not that of the source document.

Example(s):

For example, suppose the following rule:


body { background: url("yellow") }

is located in a style sheet designated by the URI:

http://www.example.org/style/basic.css

The background of the source document's BODY will be tiled with whatever image is described by the resource designated by the URI

http://www.example.org/style/yellow

User agents may vary in how they handle invalid URIs or URIs that designate unavailable or inapplicable resources.

4.3.5 Counters

Counters are denoted by identifiers (see the 'counter-increment' and 'counter-reset' properties). To refer to the value of a counter, the notation 'counter(<identifier>)' or 'counter(<identifier>, <'list-style-type'>)', with optional whitespace separating the tokens, is used. The default style is 'decimal'.

To refer to a sequence of nested counters of the same name, the notation is 'counters(<identifier>, <string>)' or 'counters(<identifier>, <string>, <'list-style-type'>)' with optional whitespace separating the tokens.

See "Nested counters and scope" in the chapter on generated content for how user agents must determine the value or values of the counter. See the definition of counter values of the 'content' property for how it must convert these values to a string.

In CSS 2.1, the values of counters can only be referred to from the 'content' property. Note that 'none' is a possible <'list-style-type'>: 'counter(x, none)' yields an empty string.

Example(s):

Here is a style sheet that numbers paragraphs (p) for each chapter (h1). The paragraphs are numbered with roman numerals, followed by a period and a space:


p {counter-increment: par-num}
h1 {counter-reset: par-num}
p:before {content: counter(par-num, upper-roman) ". "}

4.3.6 Colors

A <color> is either a keyword or a numerical RGB specification.

The list of color keywords is: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, orange, purple, red, silver, teal, white, and yellow. These 17 colors have the following values:

maroon #800000 red #ff0000 orange #ffA500 yellow #ffff00 olive #808000
purple #800080 fuchsia #ff00ff white #ffffff lime #00ff00 green #008000
navy #000080 blue #0000ff aqua #00ffff teal #008080
black #000000 silver #c0c0c0 gray #808080

In addition to these color keywords, users may specify keywords that correspond to the colors used by certain objects in the user's environment. Please consult the section on system colors for more information.

Example(s):


body {color: black; background: white }
h1 { color: maroon }
h2 { color: olive }

The RGB color model is used in numerical color specifications. These examples all specify the same color:

Example(s):


em { color: #f00 }              /* #rgb */
em { color: #ff0000 }           /* #rrggbb */
em { color: rgb(255,0,0) }      
em { color: rgb(100%, 0%, 0%) } 

The format of an RGB value in hexadecimal notation is a '#' immediately followed by either three or six hexadecimal characters. The three-digit RGB notation (#rgb) is converted into six-digit form (#rrggbb) by replicating digits, not by adding zeros. For example, #fb0 expands to #ffbb00. This ensures that white (#ffffff) can be specified with the short notation (#fff) and removes any dependencies on the color depth of the display.

The format of an RGB value in the functional notation is 'rgb(' followed by a comma-separated list of three numerical values (either three integer values or three percentage values) followed by ')'. The integer value 255 corresponds to 100%, and to F or FF in the hexadecimal notation: rgb(255,255,255) = rgb(100%,100%,100%) = #FFF. Whitespace characters are allowed around the numerical values.

All RGB colors are specified in the sRGB color space (see [SRGB]). User agents may vary in the fidelity with which they represent these colors, but using sRGB provides an unambiguous and objectively measurable definition of what the color should be, which can be related to international standards (see [COLORIMETRY]).

Conforming user agents may limit their color-displaying efforts to performing a gamma-correction on them. sRGB specifies a display gamma of 2.2 under specified viewing conditions. User agents should adjust the colors given in CSS such that, in combination with an output device's "natural" display gamma, an effective display gamma of 2.2 is produced. See the section on gamma correction for further details. Note that only colors specified in CSS are affected; e.g., images are expected to carry their own color information.

Values outside the device gamut should be clipped or mapped into the gamut when the gamut is known: the red, green, and blue values must be changed to fall within the range supported by the device. Users agents may perform higher quality mapping of colors from one gamut to another. For a typical CRT monitor, whose device gamut is the same as sRGB, the four rules below are equivalent:

Example(s):


em { color: rgb(255,0,0) }       /* integer range 0 - 255 */
em { color: rgb(300,0,0) }       /* clipped to rgb(255,0,0) */
em { color: rgb(255,-10,0) }     /* clipped to rgb(255,0,0) */
em { color: rgb(110%, 0%, 0%) }  /* clipped to rgb(100%,0%,0%) */

Other devices, such as printers, have different gamuts than sRGB; some colors outside the 0..255 sRGB range will be representable (inside the device gamut), while other colors inside the 0..255 sRGB range will be outside the device gamut and will thus be mapped.

Note. Mapping or clipping of color values should be done to the actual device gamut if known (which may be larger or smaller than 0..255).

4.3.7 Strings

Strings can either be written with double quotes or with single quotes. Double quotes cannot occur inside double quotes, unless escaped (e.g., as '\"' or as '\22'). Analogously for single quotes (e.g., "\'" or "\27").

Example(s):

"this is a 'string'"
"this is a \"string\""
'this is a "string"'
'this is a \'string\''

A string cannot directly contain a newline. To include a newline in a string, use an escape representing the line feed character in ISO-10646 (U+000A), such as "\A" or "\00000a". This character represents the generic notion of "newline" in CSS. See the 'content' property for an example.

It is possible to break strings over several lines, for esthetic or other reasons, but in such a case the newline itself has to be escaped with a backslash (\). For instance, the following two selectors are exactly the same:

Example(s):


a[title="a not s\
o very long title"] {/*...*/}
a[title="a not so very long title"] {/*...*/}

4.3.8 Unsupported Values

If a UA does not support a particular value, it should ignore that value when parsing style sheets, as if that value was an illegal value. For example:

Example(s):


  h3 {
    display: inline;
    display: run-in;
  }

A UA that supports the 'run-in' value for the 'display' property will accept the first display declaration and then "write over" that value with the second display declaration. A UA that does not support the 'run-in' value will process the first display declaration and ignore the second display declaration.

4.4 CSS style sheet representation

A CSS style sheet is a sequence of characters from the Universal Character Set (see [ISO10646]). For transmission and storage, these characters must be encoded by a character encoding that supports the set of characters available in US-ASCII (e.g., UTF-8, ISO 8859-x, SHIFT JIS, etc.). For a good introduction to character sets and character encodings, please consult the HTML 4 specification ([HTML4], chapter 5). See also the XML 1.0 specification ([XML10], sections 2.2 and 4.3.3, and Appendix F).

When a style sheet is embedded in another document, such as in the STYLE element or "style" attribute of HTML, the style sheet shares the character encoding of the whole document.

When a style sheet resides in a separate file, user agents must observe the following priorities when determining a style sheet's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field (or similar parameters in other protocols)
  2. BOM and/or @charset (see below)
  3. <link charset=""> or other metadata from the linking mechanism (if any)
  4. charset of referring style sheet or document (if any)
  5. Assume UTF-8

Authors using an @charset rule must place the rule at the very beginning of the style sheet, preceded by no characters. (If a byte order mark is appropriate for the encoding used, it may precede the @charset rule.)

After "@charset", authors specify the name of a character encoding (in quotes). For example:

@charset "ISO-8859-1";

@charset must be written literally, i.e., the 10 characters '@charset "' (lowercase, no backslash escapes), followed by the encoding name, followed by '";'.

The name must be a charset name as described in the IANA registry. See [CHARSETS] for a complete list of charsets. Authors should use the charset names marked as "preferred MIME name" in the IANA registry.

User agents must support at least the UTF-8 encoding.

User agents must ignore any @charset rule not at the beginning of the style sheet. When user agents detect the character encoding using the BOM and/or the @charset rule, they should follow the following rules:

User agents must ignore style sheets in unknown encodings.

4.4.1 Referring to characters not represented in a character encoding

A style sheet may have to refer to characters that cannot be represented in the current character encoding. These characters must be written as escaped references to ISO 10646 characters. These escapes serve the same purpose as numeric character references in HTML or XML documents (see [HTML4], chapters 5 and 25).

The character escape mechanism should be used when only a few characters must be represented this way. If most of a style sheet requires escaping, authors should encode it with a more appropriate encoding (e.g., if the style sheet contains a lot of Greek characters, authors might use "ISO-8859-7" or "UTF-8").

Intermediate processors using a different character encoding may translate these escaped sequences into byte sequences of that encoding. Intermediate processors must not, on the other hand, alter escape sequences that cancel the special meaning of an ASCII character.

Conforming user agents must correctly map to ISO-10646 all characters in any character encodings that they recognize (or they must behave as if they did).

For example, a style sheet transmitted as ISO-8859-1 (Latin-1) cannot contain Greek letters directly: "κουρος" (Greek: "kouros") has to be written as "\3BA\3BF\3C5\3C1\3BF\3C2".

Note. In HTML 4, numeric character references are interpreted in "style" attribute values but not in the content of the STYLE element. Because of this asymmetry, we recommend that authors use the CSS character escape mechanism rather than numeric character references for both the "style" attribute and the STYLE element. For example, we recommend:


<SPAN style="font-family: L\FC beck">...</SPAN>

rather than:


<SPAN style="font-family: L&#252;beck">...</SPAN>