4 Syntax and basic data types

Contents

4.1 Syntax

This section describes a grammar (and forward-compatible parsing rules) common to any version of CSS (including CSS 2.1). Future versions of CSS will adhere to this core syntax, although they may add additional syntactic constraints.

These descriptions are normative. They are also complemented by the normative grammar rules presented in Appendix G.

4.1.1 Tokenization

All levels of CSS — level 1, level 2, and any future levels — use the same core syntax. This allows UAs to parse (though not completely understand) style sheets written in levels of CSS that didn't exist at the time the UAs were created. Designers can use this feature to create style sheets that work with older user agents, while also exercising the possibilities of the latest levels of CSS.

At the lexical level, CSS style sheets consist of a sequence of tokens. The list of tokens for CSS 2.1 is as follows. The definitions use Lex-style regular expressions. Octal codes refer to ISO 10646 ([ISO10646]). As in Lex, in case of multiple matches, the longest match determines the token.

Token Definition

IDENT {ident}
ATKEYWORD @{ident}
STRING {string}
HASH #{name}
NUMBER {num}
PERCENTAGE {num}%
DIMENSION {num}{ident}
URI url\({w}{string}{w}\)
|url\({w}([!#$%&*-~]|{nonascii}|{escape})*{w}\)
UNICODE-RANGE U\+[0-9A-F?]{1,6}(-[0-9A-F]{1,6})?
CDO <!--
CDC -->
; ;
{ \{
} \}
( \(
) \)
[ \[
] \]
S [ \t\r\n\f]+
COMMENT \/\*[^*]*\*+([^/*][^*]*\*+)*\/
FUNCTION {ident}\(
INCLUDES ~=
DASHMATCH |=
DELIM any other character not matched by the above rules, and neither a single nor a double quote

The macros in curly braces ({}) above are defined as follows:

Macro Definition

ident [-]?{nmstart}{nmchar}*
name {nmchar}+
nmstart [_a-zA-Z]|{nonascii}|{escape}
nonascii[^\0-\177]
unicode \\[0-9a-f]{1,6}(\r\n|[ \n\r\t\f])?
escape {unicode}|\\[ -~\200-\4177777]
nmchar [_a-zA-Z0-9-]|{nonascii}|{escape}
num [0-9]+|[0-9]*\.[0-9]+
string {string1}|{string2}
string1 \"([\t !#$%&(-~]|\\{nl}|\'|{nonascii}|{escape})*\"
string2 \'([\t !#$%&(-~]|\\{nl}|\"|{nonascii}|{escape})*\'
nl \n|\r\n|\r|\f
w [ \t\r\n\f]*

Below is the core syntax for CSS. The sections that follow describe how to use it. Appendix G describes a more restrictive grammar that is closer to the CSS level 2 language.

stylesheet  : [ CDO | CDC | S | statement ]*;
statement   : ruleset | at-rule;
at-rule     : ATKEYWORD S* any* [ block | ';' S* ];
block       : '{' S* [ any | block | ATKEYWORD S* | ';' S* ]* '}' S*;
ruleset     : selector? '{' S* declaration? [ ';' S* declaration? ]* '}' S*;
selector    : any+;
declaration : DELIM? property S* ':' S* value;
property    : IDENT;
value       : [ any | block | ATKEYWORD S* ]+;
any         : [ IDENT | NUMBER | PERCENTAGE | DIMENSION | STRING
              | DELIM | URI | HASH | UNICODE-RANGE | INCLUDES
              | DASHMATCH | FUNCTION S* any* ')' 
              | '(' S* any* ')' | '[' S* any* ']' ] S*;

COMMENT tokens do not occur in the grammar (to keep it readable), but any number of these tokens may appear anywhere between other tokens.

The token S in the grammar above stands for whitespace. Only the characters "space" (Unicode code 32), "tab" (9), "line feed" (10), "carriage return" (13), and "form feed" (12) can occur in whitespace. Other space-like characters, such as "em-space" (8195) and "ideographic space" (12288), are never part of whitespace.

4.1.2 Keywords

Keywords have the form of identifiers. Keywords must not be placed between quotes ("..." or '...'). Thus,

red

is a keyword, but

"red"
is not. (It is a string.) Other illegal examples:

Illegal example(s):

width: "auto";
border: "none";
background: "red";

Vendor-specific extensions

In CSS2.1, identifiers may begin with '-' (dash) or '_' (underscore). Keywords and property names, beginning with -' or '_' are reserved for vendor-specific extensions. Such vendor-specific extensions should have one of the following formats:

'-' + vendor identifier + '-' + meaningful name
'_' + vendor identifier + '-' + meaningful name

Example(s):

For example, if XYZ organization added a property to describe the color of the border on the East side of the display, they might call it -xyz-border-east-color.

Other known examples:

-moz-box-sizing
-moz-border-radius
-wap-accesskey

An initial dash or underscore is guaranteed never to be used in a property or keyword by any current or future level of CSS. Thus typical CSS implementations may not recognize such properties and may ignore them according to the rules for handling parsing errors. However, because the initial dash or underscore is part of the grammar, CSS2.1 implementers should always be able to use a CSS-conforming parser, whether or not they support any vendor-specific extensions.

Informative Historical Notes

This section is informative.

At the time of writing, the following prefixes are known to exist:

prefixorganizationnotes
mso-Microsoft CorporationCreated before the working group established a naming convention for extensions.
-moz-The Mozilla Organization
-o-Opera Software
-atsc-Advanced Television Standards Committee
-wap-The WAP Forum

Vendor/organization specific extensions should be avoided.

4.1.3 Characters and case

The following rules always hold:

4.1.4 Statements

A CSS style sheet, for any version of CSS, consists of a list of statements (see the grammar above). There are two kinds of statements: at-rules and rule sets. There may be whitespace around the statements.

In this specification, the expressions "immediately before" or "immediately after" mean with no intervening whitespace or comments.

4.1.5 At-rules

At-rules start with an at-keyword, an '@' character followed immediately by an identifier (for example, '@import', '@page').

An at-rule consists of everything up to and including the next semicolon (;) or the next block, whichever comes first. A CSS user agent that encounters an unrecognized at-rule must ignore the whole of the at-rule and continue parsing after it.

CSS 2.1 user agents must ignore any '@import' rule that occurs inside a block or that doesn't precede all rule sets.

Illegal example(s):

Assume, for example, that a CSS 2.1 parser encounters this style sheet:

@import "subs.css";
h1 { color: blue }
@import "list.css";

The second '@import' is illegal according to CSS2.1. The CSS 2.1 parser ignores the whole at-rule, effectively reducing the style sheet to:

@import "subs.css";
h1 { color: blue }

Illegal example(s):

In the following example, the second '@import' rule is invalid, since it occurs inside a '@media' block.

@import "subs.css";
@media print {
  @import "print-main.css";
  body { font-size: 10pt }
}
h1 {color: blue }

4.1.6 Blocks

A block starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there may be any characters, except that parentheses (( )), brackets ([ ]) and braces ({ }) must always occur in matching pairs and may be nested. Single (') and double quotes (") must also occur in matching pairs, and characters between them are parsed as a string. See Tokenization above for the definition of a string.

Illegal example(s):

Here is an example of a block. Note that the right brace between the double quotes does not match the opening brace of the block, and that the second single quote is an escaped character, and thus doesn't match the first single quote:

{ causta: "}" + ({7} * '\'') }

Note that the above rule is not valid CSS 2.1, but it is still a block as defined above.

4.1.7 Rule sets, declaration blocks, and selectors

A rule set (also called "rule") consists of a selector followed by a declaration block.

A declaration-block (also called a {}-block in the following text) starts with a left curly brace ({) and ends with the matching right curly brace (}). In between there must be a list of zero or more semicolon-separated (;) declarations.

The selector (see also the section on selectors) consists of everything up to (but not including) the first left curly brace ({). A selector always goes together with a {}-block. When a user agent can't parse the selector (i.e., it is not valid CSS 2.1), it must ignore the {}-block as well.

CSS 2.1 gives a special meaning to the comma (,) in selectors. However, since it is not known if the comma may acquire other meanings in future versions of CSS, the whole statement should be ignored if there is an error anywhere in the selector, even though the rest of the selector may look reasonable in CSS 2.1.

Illegal example(s):

For example, since the "&" is not a valid token in a CSS 2.1 selector, a CSS 2.1 user agent must ignore the whole second line, and not set the color of H3 to red:

h1, h2 {color: green }
h3, h4 & h5 {color: red }
h6 {color: black }

Example(s):

Here is a more complex example. The first two pairs of curly braces are inside a string, and do not mark the end of the selector. This is a valid CSS 2.1 statement.

p[example="public class foo\
{\
    private int x;\
\
    foo(int x) {\
        this.x = x;\
    }\
\
}"] { color: red }

4.1.8 Declarations and properties

A declaration is either empty or consists of a property, followed by a colon (:), followed by a value. Around each of these there may be whitespace.

Because of the way selectors work, multiple declarations for the same selector may be organized into semicolon (;) separated groups.

Example(s):

Thus, the following rules:

h1 { font-weight: bold }
h1 { font-size: 12px }
h1 { line-height: 14px }
h1 { font-family: Helvetica }
h1 { font-variant: normal }
h1 { font-style: normal }

are equivalent to:

h1 {
  font-weight: bold;
  font-size: 12px;
  line-height: 14px;
  font-family: Helvetica;
  font-variant: normal;
  font-style: normal
}

A property is an identifier. Any character may occur in the value. Parentheses ("( )"), brackets ("[ ]"), braces ("{ }"), single quotes (') and double quotes (") must come in matching pairs, and semicolons not in strings must be escaped. Parentheses, brackets, and braces may be nested. Inside the quotes, characters are parsed as a string.

The syntax of values is specified separately for each property, but in any case, values are built from identifiers, strings, numbers, lengths, percentages, URIs, and colors.

A user agent must ignore a declaration with an invalid property name or an invalid value. Every CSS 2.1 property has its own syntactic and semantic restrictions on the values it accepts.

Illegal example(s):

For example, assume a CSS 2.1 parser encounters this style sheet:

h1 { color: red; font-style: 12pt }  /* Invalid value: 12pt */
p { color: blue;  font-vendor: any;  /* Invalid prop.: font-vendor */
    font-variant: small-caps }
em em { font-style: normal }

The second declaration on the first line has an invalid value '12pt'. The second declaration on the second line contains an undefined property 'font-vendor'. The CSS 2.1 parser will ignore these declarations, effectively reducing the style sheet to:

h1 { color: red; }
p { color: blue;  font-variant: small-caps }
em em { font-style: normal }

4.1.9 Comments

Comments begin with the characters "/*" and end with the characters "*/". They may occur anywhere between tokens, and their contents have no influence on the rendering. Comments may not be nested.

CSS also allows the SGML comment delimiters ("<!--" and "-->") in certain places, but they do not delimit CSS comments. They are permitted so that style rules appearing in an HTML source document (in the STYLE element) may be hidden from pre-HTML 3.2 user agents. See the HTML 4.0 specification ([HTML40]) for more information.

4.2 Rules for handling parsing errors

In some cases, user agents must ignore part of an illegal style sheet. This specification defines ignore to mean that the user agent parses the illegal part (in order to find its beginning and end), but otherwise acts as if it had not been there.

To ensure that new properties and new values for existing properties can be added in the future, user agents are required to obey the following rules when they encounter the following scenarios:

4.3 Values

4.3.1 Integers and real numbers

Some value types may have integer values (denoted by <integer>) or real number values (denoted by <number>). Real numbers and integers are specified in decimal notation only. An <integer> consists of one or more digits "0" to "9". A <number> can either be an <integer>, or it can be zero or more digits followed by a dot (.) followed by one or more digits. Both integers and real numbers may be preceded by a "-" or "+" to indicate the sign.

Note that many properties that allow an integer or real number as a value actually restrict the value to some range, often to a non-negative value.

4.3.2 Lengths

Lengths refer to horizontal or vertical measurements.

The format of a length value (denoted by <length> in this specification) is a <number> (with or without a decimal point) immediately followed by a unit identifier (e.g., px, em, etc.). After a zero length, the unit identifier is optional.

Some properties allow negative length values, but this may complicate the formatting model and there may be implementation-specific limits. If a negative length value cannot be supported, it should be converted to the nearest value that can be supported.

If a negative length value is set on a property that does not allow negative length values, the declaration is ignored.

There are two types of length units: relative and absolute. Relative length units specify a length relative to another length property. Style sheets that use relative units will more easily scale from one medium to another (e.g., from a computer display to a laser printer).

Relative units are:

Example(s):

h1 { margin: 0.5em }      /* em */
h1 { margin: 1ex }        /* ex */
p  { font-size: 12px }    /* px */

The 'em' unit is equal to the computed value of the 'font-size' property of the element on which it is used. The exception is when 'em' occurs in the value of the 'font-size' property itself, in which case it refers to the font size of the parent element. It may be used for vertical or horizontal measurement. (This unit is also sometimes called the quad-width in typographic texts.)

The 'ex' unit is defined by the font's 'x-height'. The x-height is so called because it is often equal to the height of the lowercase "x". However, an 'ex' is defined even for fonts that don't contain an "x".

Example(s):

The rule:

h1 { line-height: 1.2em }

means that the line height of "h1" elements will be 20% greater than the font size of the "h1" elements. On the other hand:

h1 { font-size: 1.2em }

means that the font-size of "h1" elements will be 20% greater than the font size inherited by "h1" elements.

When specified for the root of the document tree (e.g., "HTML" in HTML), 'em' and 'ex' refer to the property's initial value.

Pixel units are relative to the resolution of the viewing device, i.e., most often a computer display. If the pixel density of the output device is very different from that of a typical computer display, the user agent should rescale pixel values. It is recommended that the reference pixel be the visual angle of one pixel on a device with a pixel density of 96dpi and a distance from the reader of an arm's length. For a nominal arm's length of 28 inches, the visual angle is therefore about 0.0213 degrees.

For reading at arm's length, 1px thus corresponds to about 0.26 mm (1/96 inch). When printed on a laser printer, meant for reading at a little less than arm's length (55 cm, 21 inches), 1px is about 0.20 mm. On a 300 dots-per-inch (dpi) printer, that may be rounded up to 3 dots (0.25 mm); on a 600 dpi printer, it can be rounded to 5 dots.

The two images below illustrate the effect of viewing distance on the size of a pixel and the effect of a device's resolution. In the first image, a reading distance of 71 cm (28 inch) results in a px of 0.26 mm, while a reading distance of 3.5 m (12 feet) requires a px of 1.3 mm.

Showing that pixels must become
larger if the viewing distance increases

In the second image, an area of 1px by 1px is covered by a single dot in a low-resolution device (a computer screen), while the same area is covered by 16 dots in a higher resolution device (such as a 400 dpi laser printer).

Showing that more device pixels (dots)
are needed to cover a 1px by 1px area on a high-resolution device than
on a low-res one

Child elements do not inherit the relative values specified for their parent; they (generally) inherit the computed values.

Example(s):

In the following rules, the computed 'text-indent' value of "h1" elements will be 36px, not 45px, if "h1" is a child of the "body" element.

body {
  font-size: 12px;
  text-indent: 3em;  /* i.e., 36px */
}
h1 { font-size: 15px }

Absolute length units are only useful when the physical properties of the output medium are known. The absolute units are:

Example(s):

h1 { margin: 0.5in }      /* inches  */
h2 { line-height: 3cm }   /* centimeters */
h3 { word-spacing: 4mm }  /* millimeters */
h4 { font-size: 12pt }    /* points */
h4 { font-size: 1pc }     /* picas */

In cases where the computed length cannot be supported, user agents must approximate it in the actual value.

4.3.3 Percentages

The format of a percentage value (denoted by <percentage> in this specification) is a <number> immediately followed by '%'.

Percentage values are always relative to another value, for example a length. Each property that allows percentages also defines the value to which the percentage refers. The value may be that of another property for the same element, a property for an ancestor element, or a value of the formatting context (e.g., the width of a containing block). When a percentage value is set for a property of the root element and the percentage is defined as referring to the inherited value of some property, the resultant value is the percentage times the initial value of that property.

Example(s):

Since child elements (generally) inherit the computed values of their parent, in the following example, the children of the P element will inherit a value of 12px for 'line-height', not the percentage value (120%):

p { font-size: 10px }
p { line-height: 120% }  /* 120% of 'font-size' */

4.3.4 URL + URN = URI

URLs (Uniform Resource Locators, see [RFC1738] and [RFC1808]) provide the address of a resource on the Web. Another way of identifying resources is called URN (Uniform Resource Name). Together they are called URIs (Uniform Resource Identifiers, see [URI]). This specification uses the term URI.

URI values in this specification are denoted by <uri>. The functional notation used to designate URIs in property values is "url()", as in:

Example(s):

body { background: url("http://www.example.com/pinkish.png") }

The format of a URI value is 'url(' followed by optional whitespace followed by an optional single quote (') or double quote (") character followed by the URI itself, followed by an optional single quote (') or double quote (") character followed by optional whitespace followed by ')'. The two quote characters must be the same.

Example(s):

An example without quotes:

li { list-style: url(http://www.example.com/redball.png) disc }

Parentheses, commas, whitespace characters, single quotes (') and double quotes (") appearing in a URI must be escaped with a backslash: '\(', '\)', '\,'.

Depending on the type of URI, it might also be possible to write the above characters as URI-escapes (where "(" = %28, ")" = %29, etc.) as described in [URI].

In order to create modular style sheets that are not dependent on the absolute location of a resource, authors may use relative URIs. Relative URIs (as defined in [RFC1808]) are resolved to full URIs using a base URI. RFC 1808, section 3, defines the normative algorithm for this process. For CSS style sheets, the base URI is that of the style sheet, not that of the source document.

Example(s):

For example, suppose the following rule:

body { background: url("yellow") }

is located in a style sheet designated by the URI:

http://www.example.org/style/basic.css

The background of the source document's BODY will be tiled with whatever image is described by the resource designated by the URI

http://www.example.org/style/yellow

User agents may vary in how they handle URIs that designate unavailable or inapplicable resources.

4.3.5 Counters

Counters are denoted by identifiers (see the 'counter-increment' and 'counter-reset' properties). To refer to the value of a counter, the notation 'counter(<identifier>)' or 'counter(<identifier>, <list-style-type>)' is used. The default style is 'decimal'.

To refer to a sequence of nested counters of the same name, the notation is 'counters(<identifier>, <string>)' or 'counters(<identifier>, <string>, <list-style-type>)'. See "Nested counters and scope" in the chapter on generated content.

In CSS2, the values of counters can only be referred to from the 'content' property. Note that 'none' is a possible <list-style-type>: 'counter(x, none)' yields an empty string.

Example(s):

Here is a style sheet that numbers paragraphs (P) for each chapter (H1). The paragraphs are numbered with roman numerals, followed by a period and a space:

P {counter-increment: par-num}
H1 {counter-reset: par-num}
P:before {content: counter(par-num, upper-roman) ". "}

Counters that are not in the scope of any 'counter-reset', are assumed to have been reset to 0 by a 'counter-reset' on the root element.

4.3.6 Colors

A <color> is either a keyword or a numerical RGB specification.

The list of keyword color names is: aqua, black, blue, fuchsia, gray, green, lime, maroon, navy, olive, orange, purple, red, silver, teal, white, and yellow. These 17 colors have the following values:

maroon #800000 red #ff0000 orange #ffA500 yellow #ffff00 olive #808000
purple #800080 fuchsia #ff00ff white #ffffff lime #00ff00 green #008000
navy #000080 blue #0000ff aqua #00ffff teal #008080
black #000000 silver #c0c0c0 gray #808080

In addition to these color keywords, users may specify keywords that correspond to the colors used by certain objects in the user's environment. Please consult the section on system colors for more information.

Example(s):

body {color: black; background: white }
h1 { color: maroon }
h2 { color: olive }

The RGB color model is used in numerical color specifications. These examples all specify the same color:

Example(s):

em { color: #f00 }              /* #rgb */
em { color: #ff0000 }           /* #rrggbb */
em { color: rgb(255,0,0) }      
em { color: rgb(100%, 0%, 0%) } 

The format of an RGB value in hexadecimal notation is a '#' immediately followed by either three or six hexadecimal characters. The three-digit RGB notation (#rgb) is converted into six-digit form (#rrggbb) by replicating digits, not by adding zeros. For example, #fb0 expands to #ffbb00. This ensures that white (#ffffff) can be specified with the short notation (#fff) and removes any dependencies on the color depth of the display.

The format of an RGB value in the functional notation is 'rgb(' followed by a comma-separated list of three numerical values (either three integer values or three percentage values) followed by ')'. The integer value 255 corresponds to 100%, and to F or FF in the hexadecimal notation: rgb(255,255,255) = rgb(100%,100%,100%) = #FFF. Whitespace characters are allowed around the numerical values.

All RGB colors are specified in the sRGB color space (see [SRGB]). User agents may vary in the fidelity with which they represent these colors, but using sRGB provides an unambiguous and objectively measurable definition of what the color should be, which can be related to international standards (see [COLORIMETRY]).

Conforming user agents may limit their color-displaying efforts to performing a gamma-correction on them. sRGB specifies a display gamma of 2.2 under specified viewing conditions. User agents should adjust the colors given in CSS such that, in combination with an output device's "natural" display gamma, an effective display gamma of 2.2 is produced. See the section on gamma correction for further details. Note that only colors specified in CSS are affected; e.g., images are expected to carry their own color information.

Values outside the device gamut should be clipped: the red, green, and blue values must be changed to fall within the range supported by the device. For a typical CRT monitor, whose device gamut is the same as sRGB, the four rules below are equivalent:

Example(s):

em { color: rgb(255,0,0) }       /* integer range 0 - 255 */
em { color: rgb(300,0,0) }       /* clipped to rgb(255,0,0) */
em { color: rgb(255,-10,0) }     /* clipped to rgb(255,0,0) */
em { color: rgb(110%, 0%, 0%) }  /* clipped to rgb(100%,0%,0%) */

Other devices, such as printers, have different gamuts than sRGB; some colors outside the 0..255 sRGB range will be representable (inside the device gamut), while other colors inside the 0..255 sRGB range will be outside the device gamut and will thus be clipped.

4.3.7 Strings

Strings can either be written with double quotes or with single quotes. Double quotes cannot occur inside double quotes, unless escaped (as '\"' or as '\22'). Analogously for single quotes ("\'" or "\27").

Example(s):

"this is a 'string'"
"this is a \"string\""
'this is a "string"'
'this is a \'string\''

A string cannot directly contain a newline. To include a newline in a string, use the escape "\A" (hexadecimal A is the line feed character in Unicode, but represents the generic notion of "newline" in CSS). See the 'content' property for an example.

It is possible to break strings over several lines, for esthetic or other reasons, but in such a case the newline itself has to be escaped with a backslash (\). For instance, the following two selectors are exactly the same:

Example(s):

a[title="a not s\
o very long title"] {/*...*/}
a[title="a not so very long title"] {/*...*/}

4.3.8 Unsupported Values

If a UA does not support a particular value, it should ignore that value when parsing stylesheets, as if that value was an illegal value. For example:

Example(s):

  h3 {
    display: inline;
    display: run-in;
  }

A UA that supports the 'run-in' value for the 'display' property will accept the first display declaration and then "write over" that value with the second display declaration. A UA that does not support the 'run-in' value will process the first display declaration and ignore the second display declaration.

4.4 CSS document representation

A CSS style sheet is a sequence of characters from the Universal Character Set (see [ISO10646]). For transmission and storage, these characters must be encoded by a character encoding that supports the set of characters available in US-ASCII (e.g., ISO 8859-x, SHIFT JIS, etc.). For a good introduction to character sets and character encodings, please consult the HTML 4.0 specification ([HTML40], chapter 5), See also the XML 1.0 specification ([XML10], sections 2.2 and 4.3.3, and Appendix F.

When a style sheet is embedded in another document, such as in the STYLE element or "style" attribute of HTML, the style sheet shares the character encoding of the whole document.

When a style sheet resides in a separate file, user agents must observe the following priorities when determining a document's character encoding (from highest priority to lowest):

  1. An HTTP "charset" parameter in a "Content-Type" field.
  2. The @charset at-rule.
  3. Mechanisms of the language of the referencing document (e.g., in HTML, the "charset" attribute of the LINK element).

At most one @charset rule may appear in an external style sheet — it must not appear in an embedded style sheet — and it must appear at the very start of the document, not preceded by any characters. After "@charset", authors specify the name of a character encoding. The name must be a charset name as described in the IANA registry (See [IANA]. Also, see [CHARSETS] for a complete list of charsets). For example:

Example(s):

@charset "ISO-8859-1";

This specification does not mandate which character encodings a user agent must support.

Note that reliance on the @charset construct theoretically poses a problem since there is no a priori information on how it is encoded. In practice, however, the encodings in wide use on the Internet are either based on ASCII, UTF-16, UCS-4, or (rarely) on EBCDIC. This means that in general, the initial byte values of a document enable a user agent to detect the encoding family reliably, which provides enough information to decode the @charset rule, which in turn determines the exact character encoding.

4.4.1 Referring to characters not represented in a character encoding

A style sheet may have to refer to characters that cannot be represented in the current character encoding. These characters must be written as escaped references to ISO 10646 characters. These escapes serve the same purpose as numeric character references in HTML or XML documents (see [HTML40], chapters 5 and 25).

The character escape mechanism should be used when only a few characters must be represented this way. If most of a document requires escaping, authors should encode it with a more appropriate encoding (e.g., if the document contains a lot of Greek characters, authors might use "ISO-8859-7" or "UTF-8").

Intermediate processors using a different character encoding may translate these escaped sequences into byte sequences of that encoding. Intermediate processors must not, on the other hand, alter escape sequences that cancel the special meaning of an ASCII character.

Conforming user agents must correctly map to Unicode all characters in any character encodings that they recognize (or they must behave as if they did).

For example, a document transmitted as ISO-8859-1 (Latin-1) cannot contain Greek letters directly: "κουρος" (Greek: "kouros") has to be written as "\3BA\3BF\3C5\3C1\3BF\3C2".

Note. In HTML 4.0, numeric character references are interpreted in "style" attribute values but not in the content of the STYLE element. Because of this asymmetry, we recommend that authors use the CSS character escape mechanism rather than numeric character references for both the "style" attribute and the STYLE element. For example, we recommend:

<SPAN style="font-family: L\FC beck">...</SPAN>

rather than:

<SPAN style="font-family: L&#252;beck">...</SPAN>