Extensible HTML version 1.0 Strict, reformulated in XML Schema

Author: Masayasu Ishikawa (mimasa@w3.org)
$Id: xhtml1-strict.xsd,v 1.12 2000/07/11 13:39:56 mimasa Exp $

DISCLAIMER: At the moment this schema is merely the author's personal experiment. The author does not guarantee that this schema properly reformulates XHTML 1.0 Strict in XML Schema.

This is the same as HTML 4 Strict except for changes due to the differences between XML and SGML.

Namespace = http://www.w3.org/1999/xhtml

For further information, see: http://www.w3.org/TR/xhtml1

Original copyright of DTD: Copyright (c) 1998-2000 W3C (MIT, INRIA, Keio), All Rights Reserved.

This is a reformulation of the DTD version of XHTML 1.0 Strict. The DTD version is identified by the PUBLIC and SYSTEM identifiers:

  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"

Get access to the xml: attribute groups for xml:lang

Character mnemonic entities

XML Schema doesn't directly support character mnemonic entities.
To use them, use ENTITY declarations (whether in internal or external DTD subsets).

ISO Latin 1 Character Entity Set for XHTML
  PUBLIC "-//W3C//ENTITIES Latin 1 for XHTML//EN"
  SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-lat1.ent"

ISO Math, Greek and Symbolic Character Entity Set for XHTML
  PUBLIC "-//W3C//ENTITIES Symbols for XHTML//EN"
  SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-symbol.ent"

ISO Special Character Entity Set for XHTML
  PUBLIC "-//W3C//ENTITIES Special for XHTML//EN"
  SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent"

Imported Names

media type, as per [RFC2045]
  % ContentType "CDATA"

comma-separated list of media types, as per [RFC2045]
  % ContentTypes "CDATA"

a character encoding, as per [RFC2045]
  % Charset "CDATA"

a space separated list of character encodings, as per [RFC2045]
  % Charsets "CDATA"

a language code, as per [RFC1766]
  % LanguageCode "NMTOKEN"

a single character from [ISO10646]
  % Character "CDATA"

one or more digits
  % Number "CDATA"

The tabindex attribute specifies the position of the current element in the tabbing order for the current document. This value must be a number between 0 and 32767. User agents should ignore leading zeros.

space-separated list of link types
  % LinkTypes "CDATA"

single or comma-separated list of media descriptors
  % MediaDesc "CDATA"

a Uniform Resource Identifier, see [RFC2396]
  % URI "CDATA"

a space separated list of Uniform Resource Identifiers
  % UriList "CDATA"

date and time information. ISO date format
  % Datetime "CDATA"

script expression
  % Script "CDATA"

style sheet data
  % StyleSheet "CDATA"

used for titles etc.
  % Text "CDATA"

render in this frame (not used in XHTML 1.0 Strict)
  % FrameTarget "NMTOKEN"

nn for pixels or nn% for percentage length
  % Length "CDATA"

pixel, percentage, or relative
  % MultiLength "CDATA"

comma-separated list of MultiLength
  % MultiLengths "CDATA"

integer representing length in pixels
  % Pixels "CDATA"

these are used for image maps
  % Shape "(rect|circle|poly|default)"

comma separated list of lengths
  % Coords "CDATA"

Generic Attributes

core attributes common to most elements
  id       document-wide unique id
  class    space separated list of classes
  style    associated style info
  title    advisory title/amplification

internationalization attributes
  lang      language code (backwards compatible)
  xml:lang  language code (as per XML 1.0 spec)
  dir       direction for weak/neutral text

attributes for common UI events
  onclick      a pointer button was clicked
  ondblclick   a pointer button was double clicked
  onmousedown  a pointer button was pressed down
  onmouseup    a pointer button was released
  onmouseover  a pointer was moved onto the element
  onmousemove  a pointer was moved within the element
  onmouseout   a pointer was moved away from the element
  onkeypress   a key was pressed and released
  onkeydown    a key was pressed down
  onkeyup      a key was released

attributes for elements that can get the focus
  accesskey  accessibility key character
  tabindex   position in tabbing order (0-32767)
  onfocus    the element got the focus
  onblur     the element lost the focus

Common attribute sets (in DTD, %attrs;)

Text Elements

  % special       "br | span | bdo | object | img | map"
  % fontstyle     "tt | i | b | big | small"
  % phrase        "em | strong | dfn | code | q | sub | sup | samp | kbd | var | cite | abbr | acronym"
  % inline.forms  "input | select | textarea | label | button"

these can occur at block or inline level
  % misc "ins | del | script | noscript"

  % inline "a | %special; | %fontstyle; | %phrase; | %inline.forms;"

For convenience, this is defined to have "mixed" content rather than "elementOnly".
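The tabindex range and the Length and MultiLength notes above describe lexical constraints that the "CDATA" declarations themselves do not enforce. As a rough illustration only, the checks could be sketched in Python as follows. The function names and regular expressions are hypothetical and are an interpretation of the prose (for example, accepting a bare "*" as a relative length is an assumption), not part of the schema.

```python
import re

# Hypothetical patterns derived from the prose notes, not from the schema.
_DIGITS = re.compile(r"[0-9]+")                                   # one or more digits
_LENGTH = re.compile(r"[0-9]+%?")                                 # nn or nn%
_MULTI_LENGTH = re.compile(r"[0-9]+%?|[0-9]*(?:\.[0-9]+)?\*")     # nn, nn%, or relative (e.g. 0.5*)


def is_tabindex(value: str) -> bool:
    """True if value is a digit string whose number is in 0..32767.

    int() drops leading zeros, matching "user agents should ignore
    leading zeros".
    """
    return _DIGITS.fullmatch(value) is not None and int(value) <= 32767


def is_length(value: str) -> bool:
    """True for 'nn' (pixels) or 'nn%' (percentage)."""
    return _LENGTH.fullmatch(value) is not None


def is_multi_length(value: str) -> bool:
    """True for a pixel, percentage, or relative ('i*') length."""
    return _MULTI_LENGTH.fullmatch(value) is not None
```

Under these assumptions, "007" is a valid tabindex while "32768" is not, and "0.5*" is a MultiLength but not a Length.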
%Inline; covers inline or "text-level" elements
  % Inline "(#PCDATA | %inline; | %misc;)*"

Block level elements

  % heading "h1|h2|h3|h4|h5|h6"

define a complexType for lists (suitable for ul and ol)

  % lists      "ul | ol | dl"
  % blocktext  "pre | hr | blockquote | address"

define a complexType for Block elements
  % block "p | %heading; | div | %lists; | %blocktext; | fieldset | table"
  % Block "(%block; | form | %misc;)*"

define a complexType for Flow
%Flow; mixes Block and Inline and is used for list items etc.
  % Flow "(#PCDATA | %block; | form | %inline; | %misc;)*"

Content models for exclusions

a elements use %Inline; excluding a
  % a.content "(#PCDATA | %special; | %fontstyle; | %phrase; | %inline.forms; | %misc;)*"

This complexType could simply derive from "html:common", but here it is defined as a type derived (by restriction) from "html:InlineType", so that readers can see that it is a restricted form of "html:InlineType".

pre uses %Inline; excluding img, object, big, small, sub or sup
  % pre.content "(#PCDATA | a | br | span | bdo | map | tt | i | b | %phrase; | %inline.forms;)*"

This complexType could simply derive from "html:common", but here it is defined as a type derived (by restriction) from "html:InlineType", so that readers can see that it is a restricted form of "html:InlineType".

form uses %Block; excluding form
  % form.content "(%block; | %misc;)*"

This complexType could simply derive from "html:common", but here it is defined as a type derived (by restriction) from "html:BlockType", so that readers can see that it is a restricted form of "html:BlockType".
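The exclusion content models above can be read as set differences over element names. The following Python sketch illustrates that reading for a.content and form.content. The group names come from the "% inline", "% block", "% Block" and "% Inline" definitions; their memberships are taken from the quoted element lists in the notes and matched to names by position, which is an assumption. #PCDATA is left out because it is text rather than an element.

```python
# Element groups; memberships assumed from the quoted lists in the notes.
SPECIAL = {"br", "span", "bdo", "object", "img", "map"}
FONTSTYLE = {"tt", "i", "b", "big", "small"}
PHRASE = {"em", "strong", "dfn", "code", "q", "sub", "sup", "samp",
          "kbd", "var", "cite", "abbr", "acronym"}
INLINE_FORMS = {"input", "select", "textarea", "label", "button"}
MISC = {"ins", "del", "script", "noscript"}
HEADING = {"h1", "h2", "h3", "h4", "h5", "h6"}
LISTS = {"ul", "ol", "dl"}
BLOCKTEXT = {"pre", "hr", "blockquote", "address"}

# % inline and % block (lowercase): the plain element choices.
INLINE = {"a"} | SPECIAL | FONTSTYLE | PHRASE | INLINE_FORMS
BLOCK = {"p", "div", "fieldset", "table"} | HEADING | LISTS | BLOCKTEXT

# % Inline and % Block (capitalised): the content models built on them.
INLINE_CONTENT = INLINE | MISC
BLOCK_CONTENT = BLOCK | {"form"} | MISC

# Restricted content models, transcribed from the definitions in the notes.
A_CONTENT = SPECIAL | FONTSTYLE | PHRASE | INLINE_FORMS | MISC
FORM_CONTENT = BLOCK | MISC

# What each restriction removes from its base model.
A_EXCLUDED = INLINE_CONTENT - A_CONTENT
FORM_EXCLUDED = BLOCK_CONTENT - FORM_CONTENT
```

With these sets, A_EXCLUDED contains only "a" and FORM_EXCLUDED contains only "form", which matches the prose "excluding a" and "excluding form". Both restricted models are subsets of their base models, which is the property a derivation by restriction relies on.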
button uses %Flow; but excludes a, form and form controls
  % button.content "(#PCDATA | p | %heading; | div | %lists; | %blocktext; | table | %special; | %fontstyle; | %phrase; | %misc;)*"

This complexType could simply derive from "html:common", but here it is defined as a type derived (by restriction) from "html:FlowType", so that readers can see that it is a restricted form of "html:FlowType".

Document Structure

The style, meta and link elements are defined as an equivalence class of head.misc. The script and object elements are not included (for now).

Document Head

  % head.misc "(script|style|meta|link|object)*"

content model is %head.misc; combined with a single title and an optional base element in any order

The title element is not considered part of the flow of text. It should be displayed, for example as the page header or window title. Exactly one title is required per document.

  base  document base URI
  meta  generic metainformation

Relationship values can be used in principle:
  a) for document specific toolbars/menus when used with the link element in document head, e.g. start, contents, previous, next, index, end, help
  b) to link to a separate style sheet (rel="stylesheet")
  c) to make a link to a script (rel="script")
  d) by stylesheets to control how collections of html nodes are rendered into printed documents
  e) to make a link to a printable version of this document, e.g. a PostScript or PDF version (rel="alternate" media="print")

  style     style information
  script    script statements
  noscript  alternate content container for non script-based rendering

Document Body

  div  generic language/style container

Paragraphs

Headings

There are six levels of headings from h1 (the most important) to h6 (the least important).
Lists

  ul  Unordered list
  ol  Ordered (numbered) list
  li  list item

definition lists - dt for term, dd for its definition

Address - information on author

Horizontal Rule

Preformatted Text

content is %Inline; excluding "img|object|big|small|sub|sup"

Block-like Quotes

Inserted/Deleted Text

ins/del are allowed in block and inline content, but it is inappropriate to include block content within an ins element occurring in inline content.

The Anchor Element

content is %Inline; except that anchors shouldn't be nested

Inline Elements

  span     generic language/style container
  bdo      I18N BiDi over-ride
  br       forced line break
  em       emphasis
  strong   strong emphasis
  dfn      definitional
  code     program code
  samp     sample
  kbd      something user would type
  var      variable
  cite     citation
  abbr     abbreviation
  acronym  acronym
  q        inlined quote
  sub      subscript
  sup      superscript
  tt       fixed pitch font
  i        italic font
  b        bold font
  big      bigger font
  small    smaller font

Object

object is used to embed objects as part of HTML pages. param elements should precede other content. Parameters can also be expressed as attribute/value pairs on the object element itself when brevity is desired.

param is used to supply a named property value. In XML it would seem natural to follow RDF and support an abbreviated syntax where the param elements are replaced by attribute value pairs on the object start tag.

Images

To avoid accessibility problems for people who aren't able to see the image, you should provide a text description using the alt and longdesc attributes. In addition, avoid the use of server-side image maps. Note that in this schema there is no name attribute. That is only available in the transitional and frameset schemas.

usemap points to a map element which may be in this document or an external document, although the latter is not widely supported.

Client-side image maps

These can be placed in the same document or grouped in a separate document, although this isn't yet widely supported.

Forms

forms shouldn't be nested

Each label must not contain more than ONE field. Label elements shouldn't be nested.
  % InputType "(text | password | checkbox | radio | submit | reset | file | hidden | image | button)"

the name attribute is required for all but submit & reset

  select    option selector
  optgroup  option group
  option    selectable choice
  textarea  multi-line text field

The fieldset element is used to group form fields. Only one legend element should occur in the content and if present should only be preceded by whitespace.

  legend  fieldset label
  button  push button

Content is %Flow; excluding a, form and form controls

Tables

Derived from IETF HTML table standard, see [RFC1942]

The border attribute sets the thickness of the frame around the table. The default units are screen pixels.

The frame attribute specifies which parts of the frame around the table should be rendered. The values are not the same as CALS to avoid a name clash with the valign attribute.
  % TFrame "(void|above|below|hsides|lhs|rhs|vsides|box|border)"

The rules attribute defines which rules to draw between cells. If rules is absent then assume:
  "none" if border is absent or border="0", otherwise "all"
  % TRules "(none | groups | rows | cols | all)"

horizontal placement of table relative to document
  % TAlign "(left|center|right)"

horizontal alignment attributes for cell contents
  char     alignment char, e.g. char=':'
  charoff  offset for alignment char
  % cellhalign
    "align    (left|center|right|justify|char)  #IMPLIED
     char     %Character;                       #IMPLIED
     charoff  %Length;                          #IMPLIED"

vertical alignment attributes for cell contents
  % cellvalign "valign (top|middle|bottom|baseline) #IMPLIED"

I believe the table content model in XHTML 1.0 is wrong. The Tables Module in XHTML Modularization does it right.

  % CAlign "(top|bottom|left|right)"

colgroup groups a set of col elements. It allows you to group several semantically related columns together.

col elements define the alignment properties for cells in one or more columns.

The width attribute specifies the width of the columns, e.g.
  width=64    width in screen pixels
  width=0.5*  relative width of 0.5

The span attribute causes the attributes of one col element to apply to more than one column.

define a complexType for row groups (thead, tfoot, tbody)

Use thead to duplicate headers when breaking table across page boundaries, or for static headers when tbody sections are rendered in scrolling panel.

Use tfoot to duplicate footers when breaking table across page boundaries, or for static footers when tbody sections are rendered in scrolling panel.

Use multiple tbody sections when rules are needed between groups of table rows.

Scope is simpler than headers attribute for common tables
  % Scope "(row|col|rowgroup|colgroup)"

define a complexType for table cells (th and td)

th is for headers, td for data and for cells acting as both
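
The fallback for an absent rules attribute and the two col width forms described in the Tables notes can be sketched in Python as below. This is only an illustration under stated assumptions: the function names are hypothetical, an absent attribute is modelled as None, and accepting a percentage width is borrowed from the MultiLength description rather than from the col notes.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: a number followed by nothing, '%', or '*'.
_WIDTH = re.compile(r"([0-9]+(?:\.[0-9]+)?)([%*]?)")


def effective_rules(rules: Optional[str], border: Optional[str]) -> str:
    """Return the rules value to assume, given possibly absent attributes."""
    if rules is not None:
        return rules
    # rules is absent: "none" if border is absent or "0", otherwise "all".
    return "none" if border is None or border == "0" else "all"


def parse_col_width(text: str) -> Tuple[str, float]:
    """Classify a col width as pixels, percent, or relative.

    A bare '*' is rejected in this sketch.
    """
    match = _WIDTH.fullmatch(text)
    if match is None:
        raise ValueError(f"not a recognised width: {text!r}")
    kind = {"": "pixels", "%": "percent", "*": "relative"}[match.group(2)]
    return kind, float(match.group(1))
```

For instance, effective_rules(None, "2") yields "all", and parse_col_width("0.5*") yields ("relative", 0.5).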