--- rough draft ---
This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at: http://www.w3.org/pub/WWW/TR
Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather than the URLs for working drafts themselves.
The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. This specification extends HTML to support locally executable scripts including JavaScript, VBScript, and other scripting languages and systems. The approach allows for pluggability of scripting systems and leverages the W3C Working Draft for Inserting multimedia objects into HTML3 (http://www.w3.org/pub/WWW/TR/WD-insert.html).
This specification extends HTML to support client-side scripting of HTML documents including objects embedded within HTML documents. Scripts can be supplied in separate files or embedded directly within HTML documents in a manner independent of the scripting language. Scripts allow HTML forms to process input as it is entered: to ensure that values conform to specified patterns, to check consistency between fields and to compute derived fields.
Scripts can also be used to simplify authoring of active documents. The behaviour of objects inserted into HTML documents can be tailored with scripts that respond to events generated by such objects. This enables authors to create compelling and powerful web content. Developers have been experimenting with ideas for integrating executable script code within HTML. To date, the most prominent example is Netscape and Sun's "JavaScript". JavaScript code can be embedded within HTML through the use of the Netscape-defined SCRIPT tag.
This specification covers the extensions to HTML needed for client-side scripting, but leaves out the architectural and application programming interface issues for how scripting engines are implemented and how they communicate with the document and other objects on the same page. This specification formalizes the SCRIPT tag as defined by Netscape and currently implemented within Netscape Navigator 2.0 beta, and is intended to be compatible with it. In addition this specification provides a mechanism whereby user agents can be developed such that they can support any scripting language or system in a completely pluggable way.
In general, scripting languages manipulate the objects that the user agent creates to represent the document components, e.g. form fields and buttons. Scripting languages generally provide the means to:
Some user agents will generate objects corresponding to HTML elements in a fixed way. The ability to alter this binding using scripts provides considerable power to enhance both the behaviour and appearance of the document, allowing richer controls to be used in place of the default user agent object bindings.
Object naming conventions allow script programmers to query and change the properties of objects associated with specific HTML elements. JavaScript, for example, names objects based on the value of the HTML NAME attribute for form fields. For convenience and reusability, scripts may allow programmers to specify the behaviour of classes of objects as well as of particular named objects.
An example would be to define a subclass of text input fields which constrain their input values to conform to a settable mask pattern. The script specifies that HTML INPUT elements with CLASS=MaskedEdit are bound to this subclass of object. The mask pattern is set using the SCRIPT attribute for each of the INPUT elements involved.
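As a rough illustration of the MaskedEdit idea above, the following JavaScript sketch checks an input value against a mask pattern. The mask syntax (9 for a digit, A for a letter, any other character matching itself) and the function name matchesMask are assumptions made for this example; this draft does not define a mask syntax.

```javascript
// Hypothetical mask syntax (not defined by this draft):
//   9 = any decimal digit, A = any ASCII letter, anything else = literal.
function matchesMask(value, mask) {
  if (value.length !== mask.length) {
    return false;
  }
  for (let i = 0; i < mask.length; i++) {
    const m = mask.charAt(i);
    const c = value.charAt(i);
    if (m === "9") {
      if (c < "0" || c > "9") return false;
    } else if (m === "A") {
      const isLetter = (c >= "a" && c <= "z") || (c >= "A" && c <= "Z");
      if (!isLetter) return false;
    } else if (c !== m) {
      return false;
    }
  }
  return true;
}
```

With this sketch, matchesMask("555-1234", "999-9999") is true, while matchesMask("55a-1234", "999-9999") is false.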
In principle, HTML elements can be identified in various ways:
In practice, specific scripting languages may only offer a subset of these capabilities. For instance, JavaScript currently relies on the NAME attribute of HTML form fields. To bind events to event handlers, JavaScript requires you to place script statements in HTML attributes on the HTML elements themselves.
An important use of scripts is to enhance HTML forms. Embedding script statements with the HTML markup for each form is supported, but may be inappropriate when multiple scripting languages are needed to cover the full range of client platforms.
This specification defines a set of intrinsic events which are generated by objects associated with HTML elements. Other events may be defined on a language dependent basis, or by objects linked into documents via URLs.
The messages for each of these events include certain parameters. All events include a handle to the originating object, so that the script can send messages back to this object. Certain events include additional parameters as follows:
Handlers for events like mouse button clicks may be specific to which button was clicked. To avoid the user agent sending unwanted events, the binding of events to handlers may include constraints on event parameters such as the button number. This will be particularly beneficial if the script handler for an event is on a separate machine to the one generating the event.
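The constraint idea above can be sketched as a small dispatcher that forwards an event to a handler only when the event's parameters match the constraints given at binding time. The names createDispatcher, bind and dispatch, and the event shape { type, button }, are illustrative assumptions and not part of this specification.

```javascript
function createDispatcher() {
  const bindings = [];
  return {
    // Register a handler for an event type, with optional parameter constraints.
    bind(type, constraints, handler) {
      bindings.push({ type, constraints: constraints || {}, handler });
    },
    // Deliver the event only to handlers whose constraints all match.
    // Returns the number of handlers that were called.
    dispatch(event) {
      let delivered = 0;
      for (const b of bindings) {
        if (b.type !== event.type) continue;
        const matches = Object.keys(b.constraints).every(
          (key) => event[key] === b.constraints[key]
        );
        if (matches) {
          b.handler(event);
          delivered++;
        }
      }
      return delivered;
    },
  };
}
```

A handler bound with the constraint { button: 1 } is never called for a click with button 2, so an event that no handler wants need not be sent at all.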
Scripts can be placed in separate files, or inserted as part of an HTML document.
One way to associate an HTML document with an external script is to use a LINK element with REL=script. The HREF attribute gives the URL for the script, as in:
<LINK REL=script TYPE="application/perl; version=5.0" HREF="script.pl">
The forward relation "script" indicates that the URL specified with the HREF attribute references a script. The TYPE attribute is optional and provides an advisory indication of the scripting language used. Authors can provide several such LINK elements, as alternative scripts, e.g. in different scripting languages. The TYPE attribute is then used by the user agent to select a script in a language for which it has support.
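One way a user agent might choose among alternative LINK elements is sketched below. The function names, the { type, href } record shape, and the policy of falling back to an untyped link are all assumptions for illustration; this draft only says that TYPE is advisory and is used for selection.

```javascript
// Strip parameters such as "; version=5.0" and normalise case.
function baseMediaType(type) {
  return type.split(";")[0].trim().toLowerCase();
}

// links: array of { type, href } in document order (type may be undefined).
// supported: array of media types the user agent can execute.
// Returns the href of the first usable script, or null if there is none.
function selectScript(links, supported) {
  const known = supported.map(baseMediaType);
  for (const link of links) {
    if (link.type && known.includes(baseMediaType(link.type))) {
      return link.href;
    }
  }
  // Assumption: an untyped link is treated as a candidate of last resort.
  const untyped = links.find((link) => !link.type);
  return untyped ? untyped.href : null;
}
```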
The SCRIPT element can be used to reference external scripts using the SRC attribute and to include script statements within the HTML document, as in:
<SCRIPT SRC=script.js language="JavaScript"> ... Additional JavaScript statements ... </SCRIPT>
The optional SRC attribute gives a URL for an external script. This is formally equivalent to using a LINK element with REL=script. External script statements are read in and evaluated before the script statements contained within the SCRIPT element itself. Function definitions are stored but not executed when they are read; functions are executed in response to events in the document.
HTML documents can include multiple SCRIPT elements which can be placed in the document HEAD or BODY. This allows script statements for a form to be placed near to the corresponding FORM element. Note that because script statements are evaluated when the document is loaded, attempts to reference objects will fail if these objects are defined by HTML elements which occur later in the document.
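To make the load-order caveat concrete, here is a toy model of a user agent that processes the document strictly in source order. The item shapes and the function name findEarlyReferences are assumptions for this sketch; a real user agent works on parsed markup rather than on a prepared list.

```javascript
// items: document contents in source order. Each item is either
//   { kind: "element", name: "..." }  - an HTML element that creates a named object
//   { kind: "script", uses: ["..."] } - script statements evaluated at load time
// Returns the names that scripts tried to use before they were defined.
function findEarlyReferences(items) {
  const defined = new Set();
  const failures = [];
  for (const item of items) {
    if (item.kind === "element") {
      defined.add(item.name);
    } else if (item.kind === "script") {
      for (const name of item.uses) {
        if (!defined.has(name)) failures.push(name);
      }
    }
  }
  return failures;
}
```

A script that uses "userName" fails when it precedes the element of that name, and succeeds when the order is reversed.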
Script statements can be used to modify the document as it is being parsed. For instance, the HTML document:
<title>Test Document</title>
<script language=javascript>
document.write("<p><em>Hello World!</em>")
</script>
Has the same effect as the document:
<title>Test Document</title>
<p><em>Hello World!</em>
Warning: Language experts at MIT are concerned by this behaviour and recommend that script execution be driven only by events. For instance, the behaviour of script statements referring to linked objects shouldn't depend on the network load order of such objects. The same goes for multiple SCRIPT elements using the SRC attribute to link to external scripts.
The scripting language can be specified using an attribute on the SCRIPT element as follows:
For example, if there were an implementation of the Perl language interpreter as a COM object, the following could appear in the HTML document's HEAD section:
<insert id="perl"
   classid="progid:Perl.Interpreter"
   code="http://www.acme.com/perl/bin/perl.cab">
</insert>
<script SCRIPTENGINE="#perl">
# perl script here #
</script>
At least one of these attributes must be present. If more than one of these attributes is present, then SCRIPTENGINE takes precedence over TYPE, which in turn takes precedence over LANGUAGE. In the absence of any of these attributes, the scripting language is assumed to be the same as that of the external script referenced by the SRC attribute.
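The precedence rule above can be written down as a short lookup. This is a sketch only: the function name resolveScriptLanguage and the plain-object representation of the SCRIPT element's attributes are assumptions, and the SRC fallback simply reports the URL rather than fetching the script.

```javascript
// attrs: the attributes present on a SCRIPT element, keyed by lowercase name.
// Returns which attribute decides the scripting language, and its value.
function resolveScriptLanguage(attrs) {
  const order = ["scriptengine", "type", "language"];
  for (const name of order) {
    if (attrs[name] !== undefined) {
      return { source: name, value: attrs[name] };
    }
  }
  // None of the three is present: fall back to the language of the
  // external script named by SRC (not inspected in this sketch).
  if (attrs.src !== undefined) {
    return { source: "src", value: attrs.src };
  }
  return null;
}
```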
Sometimes it is convenient to be able to place one or more script statements as part of an individual HTML element. The generic SCRIPT attribute can be used with most HTML elements for this purpose. The following example is in a hypothetical object oriented scripting language and responds to Char and UnFocus events:
<INPUT TYPE=TEXT NAME="UserName" SCRIPT="Event Char(c) {check(c)}; Event UnFocus {copyvalue()}">
Only one SCRIPT attribute is permitted per element, so multiple scripting statements need to be separated in a language dependent manner, for instance with a semicolon character. The scripting language assumed for a SCRIPT attribute is determined by the most recent SCRIPT element preceding the element in which the SCRIPT attribute occurs. In the absence of any SCRIPT elements, the most recent LINK element with REL=script is used. Failing that, the default scripting language is assumed to be JavaScript.
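The lookup order described above (most recent SCRIPT element, then most recent LINK with REL=script, then JavaScript) can be sketched as follows. The record shape { kind, language } and the function name languageForAttribute are assumptions made for this example.

```javascript
// preceding: items that occur before the element carrying the SCRIPT
// attribute, in document order. Each is { kind: "script" | "link", language }.
// A "link" item stands for a LINK element with REL=script.
function languageForAttribute(preceding) {
  const lastOf = (kind) => {
    for (let i = preceding.length - 1; i >= 0; i--) {
      if (preceding[i].kind === kind) return preceding[i];
    }
    return undefined;
  };
  const script = lastOf("script");
  if (script) return script.language;
  const link = lastOf("link");
  if (link) return link.language;
  return "JavaScript"; // default stated in this draft
}
```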
For compatibility with Netscape 2.0beta and JavaScript, this specification includes a set of attributes for trapping common events. These are listed below together with the HTML elements they apply to:
In the following example, userName is a required text field. When a user attempts to leave the field, the OnBlur event handler calls the required() function to confirm that userName has a legal value.
<INPUT NAME="userName" OnBlur="required(this.value)">
One of the most frequent things done with HTML scripting is data validation on form INPUT tags. In Netscape's implementation of JavaScript, for example, a typical INPUT tag might look like this:
<INPUT NAME="num" ONCHANGE="if (!checkNum(this.value, 1, 10)) {this.focus();this.select();} else {thanks()}" VALUE="0">
The value of the OnChange attribute is called a "scriptlet." These attribute names, like all other HTML attributes, are case insensitive. The scripting language assumed for the event handler attributes is determined by the most recent SCRIPT element preceding the element in which the event handler attribute occurs. In the absence of any SCRIPT elements, the most recent LINK element with REL=script is used. Failing that, the default scripting language is assumed to be JavaScript.
Note: The generic SCRIPT attribute can be used for new kinds of events as well as the intrinsic events intercepted by OnChange etc.
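The OnChange example above calls a checkNum function that this draft does not define. A minimal sketch of what such a helper might look like is given below, assuming it should return true only when the field text is a number within the inclusive range given; the actual function used with Netscape's example may differ.

```javascript
// Hypothetical helper: true if `text` is a number between min and max inclusive.
function checkNum(text, min, max) {
  const trimmed = String(text).trim();
  if (trimmed === "") return false; // Number("") would otherwise yield 0
  const n = Number(trimmed);
  return !isNaN(n) && n >= min && n <= max;
}
```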
There are two extensions to the INSERT specification as defined in http://www.w3.org/pub/WWW/TR/WD-insert.html.
The NAME attribute allows an INSERT element to act as a new kind of HTML form field. NAME indicates that the VALUE property of the object defined by this INSERT is to be used as part of the submit process. If NAME were absent the object would be treated as though it were not actually part of the form (even though it may have appeared within a FORM block).
This attribute plays the same role as the SCRIPTENGINE attribute of the SCRIPT element. It gives authors the ability to specify the scripting engine to be used to interpret the scripting statements included with the SCRIPT attribute as part of the INSERT element. The attribute value is a URL referencing an INSERT element for the scripting engine. In its absence, the normal rules for scripting language apply.
Authors may wish to design their HTML documents to be viewable on older browsers that don't recognise the SCRIPT element. Unfortunately any script statements placed within a SCRIPT element will be visible to users. One solution is to enclose the script statements in an SGML comment, for instance:
<SCRIPT LANGUAGE="JavaScript">
<!-- to hide script contents from old browsers
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
// end hiding contents from old browsers -->
</SCRIPT>
Another solution is to use an SGML marked section to hide the script statements:
<SCRIPT LANGUAGE="JavaScript">
<![ %if-script [
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
]]>
</SCRIPT>
The <![ and ]]> in this example indicate the start and end of the marked section. The replacement text for the entity "%if-script" determines how SGML compliant parsers process the contents of the marked section. The following is an extract from the "Guidelines for Electronic Text Encoding and Interchange", edited by C. M. Sperberg-McQueen and Lou Burnard.
This specification suggests that the replacement text for the entity if-script is formally defined as "RCDATA" for user agents that support scripts, otherwise it is defaulted to "IGNORE" which causes the contents of the marked section to be ignored entirely.
Experiments with several widely deployed browsers suggest that marked sections can be used effectively for hiding scripts provided the following guidelines are adhered to:
Note: It would be cleaner to use "CDATA" rather than "RCDATA", but certain older browsers incorrectly treat a ">" character as the end of the marked section, thereby necessitating the use of "&gt;" in place of such characters where they occur in the script. It is also impractical to place the marked section around the SCRIPT element, as this causes some very widely deployed browsers to incorrectly show the string "]]>".
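The substitution described in the note can be applied mechanically before a script is placed in an RCDATA marked section. The helper name escapeForMarkedSection is an assumption, and escaping "&" as well is an extra precaution taken in this sketch (RCDATA still recognises entity references); the draft itself only mentions ">".

```javascript
// Replace characters that would confuse older browsers or be read as
// entity references inside an RCDATA marked section.
function escapeForMarkedSection(script) {
  return script
    .replace(/&/g, "&amp;") // must run first so "&gt;" is not escaped again
    .replace(/>/g, "&gt;");
}
```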
This specification defines the extensions to HTML3 needed to support scripting. To ensure that scripts and plug-ins work smoothly with browser implementations from different vendors, we also need well defined application programming interfaces (APIs) for how documents communicate with plug-ins. For instance what kinds of messages can be sent by user agents to plug-ins and vice versa. It may be worth developing a language and platform independent API for this based on IDL (interface definition language). For now, this is left to individual languages and user agents.
The DTD or document type definition provides the formal definition of the allowed syntax for HTML extensions for INSERT and SCRIPT:
<!-- Content model entities imported from parent DTD:

  %body.content allows INSERTs to contain headers, paras,
  lists, form elements and even arbitrarily nested scripts
-->

<!ENTITY % attrs
       "id           ID        #IMPLIED -- element identifier --
        class        NAMES     #IMPLIED -- for subclassing elements --
        style        CDATA     #IMPLIED -- rendering annotation --
        script       CDATA     #IMPLIED -- scripting statements --
        scriptengine CDATA     #IMPLIED -- URL for script engine --
        dir          (ltr|rtl) #IMPLIED -- I18N text direction --
        lang         NAME      #IMPLIED -- as per RFC 1766 --">

<!ENTITY % URL "CDATA" -- universal resource locator -->
<!ENTITY % Align "(top|middle|bottom|left|center|right)">
<!ENTITY % Length "CDATA" -- standard length value -->

<!-- INSERT is a character-like element for inserting objects -->

<!ELEMENT insert - - (param*, bodytext)>
<!ATTLIST insert
        %attrs                      -- id, class, style, lang, dir --
        data     %URL      #IMPLIED -- ref to object's data --
        code     %URL      #IMPLIED -- ref to object's code --
        classid  %URL      #IMPLIED -- object's UUID --
        type     CDATA     #IMPLIED -- Internet media type --
        align    %Align    #IMPLIED -- positioning inside document --
        height   %Length   #IMPLIED -- suggested height --
        width    %Length   #IMPLIED -- suggested width --
        border   %Length   #IMPLIED -- suggested link border width --
        hspace   %Length   #IMPLIED -- suggested horizontal gutter --
        vspace   %Length   #IMPLIED -- suggested vertical gutter --
        usemap   %URL      #IMPLIED -- ref to image map --
        ismap    (ismap)   #IMPLIED -- use server image map --
        >

<!-- the BODYTEXT element is needed to avoid problems with
     SGML mixed content, but is never used in actual documents -->

<!ELEMENT bodytext O O %body.content>

<!ELEMENT param - O EMPTY -- named property value -->
<!ATTLIST param
        name     CDATA     #REQUIRED -- property name --
        value    CDATA     #IMPLIED  -- property value --
        valueref %URL      #IMPLIED  -- ref to object ALIAS --
        type     CDATA     #IMPLIED  -- Internet media type --
        >

<!-- ALIAS is allowed anywhere in document HEAD and BODY
     it defines an alias for an object without inserting it -->

<!ELEMENT alias - - (param*, alias?)>
<!ATTLIST alias
        id       ID        #REQUIRED -- defines name for alias --
        data     %URL      #IMPLIED  -- ref to object's data --
        code     %URL      #IMPLIED  -- ref to object's code --
        classid  %URL      #IMPLIED  -- object's UUID --
        type     CDATA     #IMPLIED  -- Internet media type --
        >

<!-- SCRIPT is a character-like element for embedding script code
     that can be placed anywhere in the document HEAD or BODY -->

<!ELEMENT script - - (#PCDATA)*>
<!ATTLIST script
        language     CDATA #IMPLIED -- predefined script language name --
        type         CDATA #IMPLIED -- script language media type --
        scriptengine %URL  #IMPLIED -- URL for a specific script engine --
        src          %URL  #IMPLIED -- URL for an external script --
        >