W3C WD-script-960208

HTML3 Scripting

W3C Working Draft 08-Feb-96

This version:
http://www.w3.org/pub/WWW/TR/WD-script-960208.html
Latest version:
http://www.w3.org/pub/WWW/TR/WD-script.html
Editor:
Dave Raggett <dsr@w3.org>
Based on an initial draft by Charlie Kindel, and in turn derived from the Netscape extensions for JavaScript
Authors:
this will be added to as we evolve the draft

--- rough draft ---


Status of this document

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at: http://www.w3.org/pub/WWW/TR

Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather than the URLs for working drafts themselves.

Abstract

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. This specification extends HTML to support locally executable scripts including JavaScript, VBScript, and other scripting languages and systems. The approach allows for pluggability of scripting systems and leverages the W3C Working Draft for Inserting multimedia objects into HTML3 (http://www.w3.org/pub/WWW/TR/WD-insert.html).



Introduction

This specification extends HTML to support client-side scripting of HTML documents including objects embedded within HTML documents. Scripts can be supplied in separate files or embedded directly within HTML documents in a manner independent of the scripting language. Scripts allow HTML forms to process input as it is entered: to ensure that values conform to specified patterns, to check consistency between fields and to compute derived fields.

Scripts can also be used to simplify authoring of active documents. The behaviour of objects inserted into HTML documents can be tailored with scripts that respond to events generated by such objects. This enables authors to create compelling and powerful web content. Developers have been experimenting with ways of integrating executable script code within HTML. To date, the most prominent example is Netscape and Sun's "JavaScript". JavaScript code can be embedded within HTML through the use of the Netscape-defined SCRIPT tag.

This specification covers the extensions to HTML needed for client-side scripting, but leaves out the architectural and application programming interface issues of how scripting engines are implemented and how they communicate with the document and other objects on the same page. This specification formalizes the SCRIPT tag as defined by Netscape and currently implemented within Netscape Navigator 2.0 beta, and is intended to be compatible with it. In addition, this specification provides a mechanism whereby user agents can be developed to support any scripting language or system in a completely pluggable way.


The Computational Model for Scripting

In general, scripting languages manipulate the objects that the user agent creates to represent the document components, e.g. form fields and buttons. Scripting languages generally provide the means to:

Some user agents will generate objects corresponding to HTML elements in a fixed way. The ability to alter this binding using scripts provides considerable power to enhance both the behaviour and appearance of the document, allowing richer controls to be used in place of the default user agent object bindings.

Binding Events to HTML Elements

Very few programming languages provide language support for binding event handler code to specific objects. Visual Basic and HyperTalk do, but JavaScript, NewtonScript, Perl and Python are examples of languages which do not. For this reason, this specification proposes an extension to HTML that describes these bindings. In many cases, event handlers are bound to specific objects, but sometimes it is convenient to use the same handler for a set of related objects, for instance an array of buttons or form fields.

This proposal allows an event handler to be given as an HTML SCRIPT element. The element specifies the name of the event, and the binding of the handler to other HTML elements. The user agent knows the binding of HTML elements to objects, and can therefore determine the binding of event handlers given by SCRIPT elements to these objects. The user agent provides this information to the scripting engine along with the handler code. The scripting engine is then responsible for linking the script handlers with the corresponding objects.

If this process occurs dynamically as the document is being parsed, then these objects may not have been created yet. To preclude this situation, the HTML user agent may choose to defer passing a script handler to the script engine until all of the corresponding objects are created. In languages like Visual Basic, you can directly specify which object is associated with each event handler. In this case, the burden falls on the script engine to defer binding the handler to the object until such time as the object has been created.
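
The deferral described above can be modelled as a small queue: a handler is held back while any of the objects it binds to is still missing, and is passed on once the last one has been created. The following Python sketch is only an illustration of that idea. The names DeferredBinder, add_handler and object_created are hypothetical and are not defined by this specification, and the callback merely stands in for a real script engine.

```python
class DeferredBinder:
    """Holds script handlers until every object they bind to exists."""

    def __init__(self, pass_to_engine):
        self.pass_to_engine = pass_to_engine  # stand-in for the script engine
        self.created = set()                  # identifiers of objects created so far
        self.pending = []                     # (handler, identifiers still missing)

    def add_handler(self, handler, target_ids):
        """Called when the parser meets a SCRIPT element defining a handler."""
        missing = set(target_ids) - self.created
        if missing:
            self.pending.append((handler, missing))
        else:
            self.pass_to_engine(handler)

    def object_created(self, object_id):
        """Called when the user agent creates the object for an HTML element."""
        self.created.add(object_id)
        still_pending = []
        for handler, missing in self.pending:
            missing.discard(object_id)
            if missing:
                still_pending.append((handler, missing))
            else:
                self.pass_to_engine(handler)
        self.pending = still_pending


passed = []
binder = DeferredBinder(passed.append)
binder.add_handler("edit1.OnChange", ["edit1", "button1"])
binder.object_created("edit1")    # button1 is still missing, nothing is passed on
binder.object_created("button1")  # the handler is now released to the engine
print(passed)
```

A script engine that performs the deferral itself, as in the Visual Basic case, could keep the same kind of pending list internally.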

The proposal allows bindings of event handler code to arbitrary HTML elements in addition to hypertext links and form fields. Each event handler is represented as a separate SCRIPT element. This gives authoring tools access to the name of the event and an optional parameter list in a language independent manner. A flexible mechanism is proposed for binding each SCRIPT element to HTML elements associated with objects that source events. The simplest method is by reference to an ID value. Other possibilities include the NAME attribute for form fields, element tag names and the CLASS attribute.

Executing script statements as the document is received over the net is supported by some languages, for example JavaScript. Authors should be aware that this risks scripts referring to objects or script data structures and procedures that don't yet exist. In particular, it is dangerous to rely on network load order for externally defined resources and scripts.

Object naming conventions

Object naming conventions allow script programmers to query and change the properties of objects associated with specific HTML elements. JavaScript, for example, names objects based on the value of the HTML NAME attribute for form fields. For convenience and reusability, scripts may allow programmers to specify the behaviour for sets of objects as well as individual objects.

An example would be to define a subclass of text input fields which constrain their input values to conform to a settable mask pattern. The script specifies that HTML INPUT elements with CLASS=MaskedEdit are bound to this subclass of object. The mask pattern is set using the SCRIPT attribute for each of the INPUT elements involved.

In principle, HTML elements can be identified in various ways:

In practice, specific scripting languages may only offer a subset of these capabilities. For instance, JavaScript currently relies on the NAME attribute of HTML form fields. To bind events to event handlers, JavaScript requires you to place script statements in HTML attributes on the HTML elements themselves.

An important use of scripts is to enhance HTML forms. Embedding script statements with the HTML markup for each form is supported, but may be inappropriate when multiple scripting languages are needed to cover the full range of client platforms.

HTML Intrinsic Event Model

This specification proposes a set of intrinsic events which are generated by objects associated with HTML elements. Other events may be defined on a language dependent basis, or by objects linked into documents via URLs. Additional intrinsic events may be standardized as further experience is gained with cross platform scripting languages.

OnLoad
Sent by an object when it has just been created. This allows script handlers to initialize objects and to allocate resources to be associated with the object.
OnUnload
Sent by an object when it is about to be deleted. This allows resources associated with the object to be freed.
OnHide
Sent by an object when it is hidden, either by another window, or when the document is pushed onto the history stack when the user follows a link to another page.
OnShow
Sent by an object when it is shown, e.g. when the user backtracks to an earlier page, or when the page is shown following a click on a link. This is the inverse of the OnHide event.
OnClick
Sent by an object when the user clicks on it with a pointer device.
OnDoubleClick
Sent by an object when the user double clicks on it with a pointer device.
OnTrack
Sent by an object as the pointer device is moved across it.
OnDrag
Sent by an object as it is dragged by a pointer device, e.g. when the user moves the mouse with the left button down.
OnDrop
Sent by an object when it is dropped after a drag operation, e.g. when the user releases the mouse button.
OnFocus
Sent by a form field when it gains the keyboard input focus.
OnUnFocus
Sent by a form field when it loses the keyboard input focus.
OnSubmit
Sent by form elements when the user submits the form.
OnSelect
Sent by an object when the user selects some of the text within a form text field.
OnChar
Sent by the object with the focus when the user types a character.
OnChange
Sent by an object when it changes, e.g. when the user alters the textual contents of a text field in a form.

The messages for each of these events are associated with certain parameters. All events involve a handle to the originating object, allowing the script to send messages back to this object. Certain events involve additional parameters as follows:

OnClick
OnDoubleClick
The number of the button which was clicked as well as the location clicked.
OnTrack
OnDrag
OnDrop
The location of the current pointer position.
OnChar
The character code and any associated modifiers.

Handlers for events like mouse button clicks may be specific to which button was clicked. To avoid the user agent sending unwanted events, the binding of events to handlers may include constraints on event parameters such as the button number. This will be particularly beneficial if the script handler for an event is on a separate machine to the one generating the event.
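
As an illustration of such constraints, the Python sketch below filters events before they are delivered, so that a handler bound with a constraint on the button number only receives matching clicks. The representation of a binding as an (event name, constraints, handler) tuple and the function names are assumptions made for this example; the specification does not define a syntax for constraints.

```python
def matches(constraints, params):
    """True when every constrained parameter has the required value."""
    return all(params.get(name) == value for name, value in constraints.items())


def dispatch(bindings, event_name, params):
    """Return the handlers that should receive this event."""
    return [
        handler
        for name, constraints, handler in bindings
        if name == event_name and matches(constraints, params)
    ]


# Hypothetical bindings: (event name, parameter constraints, handler name)
bindings = [
    ("OnClick", {"button": 1}, "select_item"),
    ("OnClick", {"button": 2}, "show_menu"),
    ("OnClick", {}, "log_click"),  # no constraint: receives every click
]

print(dispatch(bindings, "OnClick", {"button": 2, "location": (10, 20)}))
```

Filtering at the source in this way means that an event which no handler wants is never sent at all, which is the benefit noted above for handlers running on another machine.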


HTML Scripting Extensions

Scripts can be placed in separate files, or inserted as part of an HTML document.

The LINK Element

One way to associate an HTML document with an external script is to use a LINK element with REL=script. The HREF attribute gives the URL for the script, as in:

    <LINK REL=script TYPE="application/perl; version=5.0" HREF="script.pl">

The forward relation "script" indicates that the URL specified with the HREF attribute references a script. The TYPE attribute is optional and provides an advisory indication of the scripting language used. Authors can provide several such LINK elements, as alternative scripts, e.g. in different scripting languages. The TYPE attribute is then used by the user agent to select a script in a language for which it has support.
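
The selection among alternative LINK elements might be sketched in Python as follows. This is an assumption-laden illustration, not a required algorithm: the function names are hypothetical, the media type "application/x-example-script" is invented for the example, and the sketch simply skips LINK elements that carry no TYPE, a case this specification leaves open.

```python
def parse_media_type(value):
    """Split 'application/perl; version=5.0' into a type and its parameters."""
    parts = [part.strip() for part in value.split(";")]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return parts[0].lower(), params


def choose_script(links, supported):
    """Return the HREF of the first REL=script LINK in a supported language."""
    for link in links:
        if link.get("rel", "").lower() != "script":
            continue
        media_type, _ = parse_media_type(link.get("type", ""))
        if media_type in supported:
            return link["href"]
    return None


# Two alternative scripts; the second media type is made up for this example.
links = [
    {"rel": "script", "type": "application/perl; version=5.0", "href": "script.pl"},
    {"rel": "script", "type": "application/x-example-script", "href": "script.ex"},
]

print(choose_script(links, {"application/x-example-script"}))
```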


The SCRIPT Element

<!-- SCRIPT is a character-like element for embedding script code
      that can be placed anywhere in the document HEAD or BODY -->

<!ENTITY % Event "CDATA" -- event name and optional param list -->

<!ELEMENT script - - (#PCDATA)*>
<!ATTLIST script
        language     CDATA    #IMPLIED -- predefined script language name --
        type         CDATA    #IMPLIED -- script language media type --
        scriptengine %URL     #IMPLIED -- URL for a specific script engine --
        src          %URL     #IMPLIED -- URL for an external script --
        event        %Event   #IMPLIED -- event name for handler --
        for          %URL     #IMPLIED -- binding to HTML elements --
        >

Using SCRIPT for external scripts

The SCRIPT element can be used to reference external scripts using the SRC attribute and to include script statements within the HTML document, as in:

    <SCRIPT SRC=script.js language="JavaScript">
    ... Additional JavaScript statements ... 
    </SCRIPT>

The optional SRC attribute gives a URL for an external script. This is formally equivalent to using a LINK element with REL=script. External script statements are read in and evaluated before the script statements contained within the SCRIPT element itself. Functions are stored, but not executed. Functions are executed by events in the document, or as the result of evaluating separate script statements.

HTML documents can include multiple SCRIPT elements which can be placed in the document HEAD or BODY. This allows script statements for a form to be placed near to the corresponding FORM element. Note that because script statements are evaluated when the document is loaded, attempts to reference objects will fail if these objects are defined by HTML elements which occur later in the document.

Self-Modifying Documents

Some scripting languages permit script statements to be used to modify the document as it is being parsed. For instance, the HTML document:

    <title>Test Document</title>
    <script language=javascript>
        document.write("<p><em>Hello World!</em>")
    </script>

has the same effect as the document:

    <title>Test Document</title>
    <p><em>Hello World!</em>

Specifying the Scripting Language

The scripting language is specified using an attribute on the SCRIPT element as follows:

LANGUAGE
The LANGUAGE attribute provides a text string that identifies the programming language, e.g. LANGUAGE="JavaScript". This method for specifying the scripting language is included for backwards compatibility, and may be obsoleted in future revisions to this specification.
TYPE
The TYPE attribute specifies an Internet media type and associated parameters for the scripting language, e.g. TYPE="application/perl; version=5.0". The "TYPE" attribute is commonly used in HTML for Internet media types, e.g. for stylesheet languages with the STYLE element, and for the LINK and INSERT elements.
SCRIPTENGINE
The INSERT element enables arbitrary code ("applets", if you will) to be downloaded, installed and executed. This allows script interpreters to be treated as just another component to be inserted into an HTML document. The SCRIPTENGINE attribute specifies the interpreter (or script engine) via a URL that names the INSERT element for the interpreter.

For example, if there were an implementation of the Perl language interpreter as a COM object, the following could appear in the HTML document's HEAD section:

    <insert
       id="perl"
       classid="progid:Perl.Interpreter"
       code="http://www.acme.com/perl/bin/perl.cab"
    >
    </insert>
    <script SCRIPTENGINE="#perl">
             # perl script here
             #
    </script>

At least one of these attributes must be present. If more than one of these attributes is present, then SCRIPTENGINE takes precedence over TYPE, which in turn takes precedence over LANGUAGE.
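
The precedence rule can be expressed compactly. The Python sketch below is illustrative only: the function name is hypothetical, and the SCRIPT element's attributes are assumed to be available as a simple name-to-value mapping.

```python
def script_language_source(attrs):
    """Return (attribute name, value) for the attribute that decides the language.

    SCRIPTENGINE takes precedence over TYPE, which takes precedence over
    LANGUAGE. Attribute names are compared without regard to case.
    """
    lowered = {name.lower(): value for name, value in attrs.items()}
    for name in ("scriptengine", "type", "language"):
        if name in lowered:
            return name, lowered[name]
    raise ValueError("SCRIPT needs one of SCRIPTENGINE, TYPE or LANGUAGE")


print(script_language_source({
    "LANGUAGE": "JavaScript",
    "TYPE": "application/perl; version=5.0",
}))
```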


Defining Event Handlers with SCRIPT

You can include the handler for an event in an HTML document using the SCRIPT element. Each event handler needs a separate SCRIPT element. Here is an example of an event handler for a text box:

    <INPUT NAME=edit1 SIZE=20>

    <SCRIPT LANGUAGE=VBScript EVENT=OnChange FOR="field:edit1">
        If edit1.value = "abc" Then
          button1.enabled = True
        Else
          button1.enabled = False
        End If
    </SCRIPT>

This example handles "OnChange" events for the INPUT element with the value "edit1" for the NAME attribute. The EVENT attribute defines the event name. This is either one of the intrinsic events, or an event specific to this object. The FOR attribute is used to bind the handler to appropriate HTML elements. It uses the URL fragment identifier to provide a flexible means of addressing within HTML documents. It has the following syntax:

FOR = " URL#expression "

Typically the SCRIPT element is in the same document as the HTML elements it binds to. As a result the URL part is typically void. The expression is a URL fragment identifier with its syntax matching one of the following:

id:id-value
This is used when the script binds to a single HTML element. The element must have an ID attribute with a matching value.

field:field-name
This is used when the handler is to be used for one or more form fields with the same NAME attribute.

field:form-name/field-name
When there are several forms in an HTML document, the same NAME value may be used in more than one form. This syntax allows you to prefix the field NAME with the name of the enclosing FORM element.

tag:tag-name
This is used when a handler is to be used with all elements with the same tag name.

tag:tag-name/class-name
Similar to the above, but this syntax allows you to restrict the handler to those elements that belong to a given subclass, that is, those with a matching CLASS value.
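
To make the five forms concrete, the following Python sketch parses a FOR value and tests whether an element is covered by it. It is a sketch under stated assumptions rather than a normative algorithm: the function names are hypothetical, elements are modelled as plain dictionaries, tag names are compared without regard to case, and CLASS is treated as a space separated list of names.

```python
SCHEMES = ("id", "field", "tag")


def parse_for(value):
    """Parse a FOR value such as 'field:form1/edit1' into (url, scheme, parts)."""
    url, _, expression = value.strip().rpartition("#")  # the URL part may be void
    scheme, colon, rest = expression.partition(":")
    parts = rest.split("/")
    if not colon or scheme not in SCHEMES or not all(parts) or len(parts) > 2:
        raise ValueError("malformed FOR attribute: %r" % value)
    if scheme == "id" and len(parts) != 1:
        raise ValueError("id: takes a single ID value: %r" % value)
    return url, scheme, parts


def element_matches(binding, element, form_name=None):
    """element is a dict with 'tag' and optional 'id', 'name', 'class' keys."""
    _, scheme, parts = binding
    if scheme == "id":
        return element.get("id") == parts[0]
    if scheme == "field":
        if len(parts) == 2:
            return form_name == parts[0] and element.get("name") == parts[1]
        return element.get("name") == parts[0]
    # scheme == "tag", optionally restricted to a class
    if element["tag"].lower() != parts[0].lower():
        return False
    return len(parts) == 1 or parts[1] in element.get("class", "").split()


edit1 = {"tag": "INPUT", "name": "edit1"}
print(parse_for("field:form1/edit1"))
print(element_matches(parse_for("field:edit1"), edit1))
```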

... a bunch of motivating examples are needed, either here or in a separate document ...

Parameter Passing

Most events are associated with one or more parameters. For some languages, for example HyperTalk, the parameters are passed via global variables (e.g. the mouse location is accessible via "the clickLoc"). This is problematic if the global variables are overwritten by subsequent events of the same type. Other languages require explicit parameter lists. The type of a parameter may be implicit, according to its position in the list, or given explicitly as with C++. Some languages use tagged data that include type information within each parameter (e.g. Poplog).

(a) Implicit Parameters

With this approach, the names and types of parameters are implied by the event name. In the example below, button and location are implicit for all OnClick events:

    <script event=OnClick
                for="id:image1" language=WebScript>
        If button = 1 Then
            ...
        End If
    </script>

(b) Named Parameters

With this approach, the EVENT attribute includes a bracketed list of parameter names after the event name. The types of the parameters are implicit depending on the event name and position in the list. This also works well for languages with tagged data types, as then, the parameters carry their own type information. The above example becomes:

    <script event="OnClick(button, location)"
                for="id:image1" language=WebScript>
        If button = 1 Then
            ...
        End If
    </script>

(c) Typed Parameters

With this approach, typing information is included with the parameter names. The types need to be mapped to the typing system in use for each scripting language. The example becomes:

    <script event="OnClick(button as integer, location as point)"
                for="id:image1" language=WebScript>
        If button = 1 Then
            ...
        End If
    </script>

For this approach to work, it seems like a standard syntax will be needed for typing information. Would the following be okay?

param-name as type-name

Where type-name is one of: integer, string, real, point, or object for a self typed object.
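
A user agent or authoring tool could read all three styles of EVENT value with one small parser. The Python sketch below is only an illustration of the proposed "param-name as type-name" syntax; the function name parse_event and the decision to report untyped parameters with a type of None are assumptions of this example.

```python
import re

TYPE_NAMES = {"integer", "string", "real", "point", "object"}


def parse_event(value):
    """Parse an EVENT value into (event name, [(param name, type or None), ...])."""
    match = re.fullmatch(r"\s*(\w+)\s*(?:\((.*)\))?\s*", value)
    if not match:
        raise ValueError("malformed EVENT attribute: %r" % value)
    name, param_text = match.group(1), match.group(2)
    params = []
    if param_text and param_text.strip():
        for item in param_text.split(","):
            words = item.split()
            if len(words) == 1:
                # Named parameter, type implied by event name and position
                params.append((words[0], None))
            elif (len(words) == 3 and words[1].lower() == "as"
                  and words[2].lower() in TYPE_NAMES):
                # Typed parameter: "param-name as type-name"
                params.append((words[0], words[2].lower()))
            else:
                raise ValueError("malformed parameter: %r" % item.strip())
    return name, params


print(parse_event("OnClick"))
print(parse_event("OnClick(button, location)"))
print(parse_event("OnClick(button as integer, location as point)"))
```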


Compatibility with Netscape 2.0

For compatibility with Netscape 2.0 beta and JavaScript, implementors may wish to support the following set of attributes for trapping common events. These are listed below together with the HTML elements they apply to:

OnLoad
A load event occurs when Navigator finishes loading a window or all frames within a <FRAMESET>. The OnLoad event handler executes JavaScript code when a load event occurs. Use the OnLoad event handler within either the <BODY> or the Netscape <FRAMESET> tag, for example, <BODY OnLoad="...">.
OnUnload
An unload event occurs when you exit a document. The OnUnload event handler executes JavaScript code when an unload event occurs. Use the OnUnload event handler within either the <BODY> or the Netscape <FRAMESET> tag, for example, <BODY OnUnload="...">.
OnClick
A click event occurs when an object on a form is clicked. The OnClick event handler executes JavaScript code when a click event occurs. This event is generated by buttons, checkboxes, radio buttons, hypertext links, reset and submit buttons, and is used only with INPUT and anchor elements.
OnMouseOver
This event is sent as the mouse is moved over an object. It is only applicable to hypertext links. This attribute is used only with the anchor element.
OnFocus
A focus event occurs when a field gains the input focus by tabbing or clicking with the mouse. Selecting within a field results in a select event, not a focus event. It is generated by select menus, single and multi-line text input fields. This attribute is used only with the SELECT, INPUT and TEXTAREA elements.
OnBlur
A blur event occurs when a select menu, single and multi-line text input field on a form loses the input focus. This is Netscape's name for the UnFocus event defined earlier in this specification. It is generated by select menus, single and multi-line text input fields. This attribute is used only with the SELECT, INPUT and TEXTAREA elements.
OnSubmit
A submit event occurs when a user submits a form. JavaScript requires you to return true in the event handler to allow the form to be submitted; return false to prevent the form from being submitted. This attribute is used only with the FORM element.
OnSelect
A select event occurs when a user selects some of the text within a single or multi-line text field. This attribute is used only with the INPUT and TEXTAREA elements.
OnChange
A change event occurs when a select, single or multi-line text field loses the input focus and its value has been modified. This attribute is used only with the SELECT, INPUT and TEXTAREA elements.

In the following example, userName is a required text field. When a user attempts to leave the field, the OnBlur event handler calls the required() function to confirm that userName has a legal value.

    <INPUT NAME="userName" OnBlur="required(this.value)">

One of the most frequent uses of HTML scripting is data validation on form INPUT tags. In Netscape's implementation of JavaScript, for example, a typical INPUT tag might look like this:

    <INPUT NAME="num"
        ONCHANGE="if (!checkNum(this.value, 1, 10)) 
            {this.focus();this.select();} else {thanks()}"
        VALUE="0">

The value of the OnChange attribute is called a "scriptlet." These attribute names, like all other HTML attributes, are case insensitive. The scripting language assumed for the event handler attributes is determined by the most recent SCRIPT element preceding the element in which the event handler attribute occurs. In the absence of any SCRIPT elements, the most recent LINK element with REL=script is used. The default scripting language is assumed to be JavaScript.
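
The lookup order for the scriptlet language can be sketched as follows. This Python fragment is illustrative and makes several assumptions that go beyond the text: the function name is hypothetical, the elements preceding the one with the handler attribute are given as (tag, attributes) pairs with lower case attribute names, and a SCRIPT or LINK element that does not name a language falls through to the JavaScript default.

```python
DEFAULT_LANGUAGE = "JavaScript"


def handler_language(preceding):
    """Pick the language for an event handler attribute.

    preceding: (tag, attrs) pairs in document order for the elements that
    occur before the element carrying the event handler attribute.
    """
    scripts = [attrs for tag, attrs in preceding if tag.lower() == "script"]
    if scripts:
        attrs = scripts[-1]  # most recent SCRIPT element
        for name in ("scriptengine", "type", "language"):
            if name in attrs:
                return attrs[name]
        return DEFAULT_LANGUAGE
    links = [
        attrs for tag, attrs in preceding
        if tag.lower() == "link" and attrs.get("rel", "").lower() == "script"
    ]
    if links and "type" in links[-1]:  # most recent LINK with REL=script
        return links[-1]["type"]
    return DEFAULT_LANGUAGE


document = [
    ("LINK", {"rel": "script", "type": "application/perl; version=5.0"}),
    ("SCRIPT", {"language": "VBScript"}),
]
print(handler_language(document))
```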

Note: The generic SCRIPT attribute can be used for new kinds of events as well as the intrinsic events intercepted by OnChange etc.


Using INSERT elements as Form Fields

This extends the INSERT specification as defined in http://www.w3.org/pub/WWW/TR/WD-insert.html.

The NAME attribute - CDATA

The NAME attribute allows an INSERT element to act as a new kind of HTML form field. NAME indicates that the VALUE property of the object defined by this INSERT is to be used as part of the submit process. If NAME were absent the object would be treated as though it were not actually part of the form (even though it may have appeared within a FORM block).


Deployment Issues

Authors may wish to design their HTML documents to be viewable on older browsers that don't recognise the SCRIPT element. Unfortunately any script statements placed within a SCRIPT element will be visible to users. One solution is to enclose the script statements in an SGML comment, for instance:

<SCRIPT LANGUAGE="JavaScript">
<!--  to hide script contents from old browsers
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
// end hiding contents from old browsers  -->
</SCRIPT>

Another solution is to use an SGML marked section to hide the script statements:

<SCRIPT LANGUAGE="JavaScript">
<![ %if-script [
  function square(i) {
    document.write("The call passed ", i ," to the function.","&lt;BR&gt;")
    return i * i
  }
  document.write("The function returned ",square(5),".")
]]>
</SCRIPT>

The <![ and ]]> in this example indicate the start and end of the marked section. The replacement text for the entity "%if-script" determines how SGML compliant parsers process the contents of the marked section. The following is an extract from the "Guidelines for Electronic Text Encoding and Interchange", edited by C. M. Sperberg-McQueen and Lou Burnard.

INCLUDE
The marked section should be included in the document and processed normally.
IGNORE
The marked section should be ignored entirely; if the SGML application program produces output from the document, the marked section will be excluded from the document.
CDATA
The marked section may contain strings of characters which look like SGML tags or entity references, but which should not be recognized as such by the SGML parser. (These Guidelines use such CDATA marked sections to enclose the examples of SGML tagging.)
RCDATA
The marked section may contain strings of characters which look like SGML tags, but which should not be recognized as such by the SGML parser; entity references, on the other hand, may be present and should be recognized and expanded as normal.
TEMP
The passage included in the marked section is a temporary part of the document; the marked section is used primarily to indicate its location, so that it can be removed or revised conveniently later.

This specification suggests that the replacement text for the entity if-script is formally defined as "RCDATA" for user agents that support scripts, otherwise it is defaulted to "IGNORE" which causes the contents of the marked section to be ignored entirely.

Experiments with several widely deployed browsers suggest that marked sections can be used effectively for hiding scripts provided the following guidelines are adhered to:

  • Place the marked section around the script statements, but within the SCRIPT element itself.
  • Replace instances of "<", ">" and "&" in script statements by the SGML entities &lt;, &gt; and &amp; respectively.

Note: It would be cleaner to use "CDATA" rather than "RCDATA", but certain older browsers incorrectly treat a ">" char as the end of the marked section, thereby necessitating the use of "&gt;" in place of such characters where they occur in the script. It is also impractical to place the marked section around the SCRIPT element, as this causes some very widely deployed browsers to incorrectly show the string "]]>".
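
An authoring tool could apply both guidelines mechanically. The Python sketch below escapes the three characters and wraps the result in a marked section inside the SCRIPT element; the function names are hypothetical and the output layout simply follows the example given earlier in this section.

```python
def escape_for_rcdata(script):
    """Replace '&', '<' and '>' by SGML entity references.

    '&' is handled first so that the '&' of the other entities is not
    escaped a second time.
    """
    return (script.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace(">", "&gt;"))


def wrap_script(script, language="JavaScript"):
    """Place the escaped script in a marked section within a SCRIPT element."""
    return (
        '<SCRIPT LANGUAGE="%s">\n' % language
        + "<![ %if-script [\n"
        + escape_for_rcdata(script) + "\n"
        + "]]>\n"
        + "</SCRIPT>"
    )


print(wrap_script('if (a < b && b > c) document.write("<BR>")'))
```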


Further Work

This specification defines the extensions to HTML3 needed to support scripting. To ensure that scripts and plug-ins work smoothly with browser implementations from different vendors, we need well defined application programming interfaces (APIs) for:

  1. How User Agents communicate with document objects
  2. How User Agents communicate with Scripting Engines
  3. How Objects and Scripts can identify and send messages to other objects on the same document
  4. Platform independent properties and events for common objects

It may be worth developing a language and platform independent API for this based on IDL (interface definition language). For now, this is left to vendors for individual languages and user agents.


References

Internet Media Types - RFC 1590
J. Postel. "Media Type Registration Procedure." RFC 1590, USC/ISI, March 1994. This can be found at ftp://ds.internic.net/rfc/rfc1590.txt.
MIME - RFC 1521
Borenstein N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993. This can be found at ftp://ds.internic.net/rfc/rfc1521.txt.
SGML Marked Sections
Dan Connolly has a paper on the use of SGML Marked Sections at http://www.w3.org/pub/WWW/MarkUp/WD-doctypes. The TEI also has information at http://www.ebt.com/usrbooks/teip3/2404.
The Component Object Model specification
This is available from http://www.microsoft.com/intdev/inttech/comintro.htm.
OLE Scripting
An introduction to OLE Scripting is available from http://microsoft.com/intdev/inttech/olescrpt.htm.
MetaCard
This is available from http://www.csn.net/MetaCard/mtd.html.

W3C: The World Wide Web Consortium: http://www.w3.org/