W3C WD-script-960124

HTML3 Scripting

W3C Working Draft 24-Jan-96

This version:
http://www.w3.org/pub/WWW/TR/WD-script-960124.html
Latest version:
http://www.w3.org/pub/WWW/TR/WD-script.html
Editor:
Dave Raggett <dsr@w3.org>
Based on an initial draft by Charlie Kindel, and in turn derived from the Netscape extensions for JavaScript
Authors:
this will be added to as we evolve the draft

--- rough draft ---


Status of this document

This is a W3C Working Draft for review by W3C members and other interested parties. It is a draft document and may be updated, replaced or obsoleted by other documents at any time. It is inappropriate to use W3C Working Drafts as reference material or to cite them as other than "work in progress". A list of current W3C working drafts can be found at: http://www.w3.org/pub/WWW/TR

Note: since working drafts are subject to frequent change, you are advised to reference the above URL, rather than the URLs for working drafts themselves.

Abstract

The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents with generic semantics that are appropriate for representing information from a wide range of applications. This specification extends HTML to support locally executable scripts including JavaScript, VBScript, and other scripting languages and systems. The approach allows for pluggability of scripting systems and leverages the W3C Working Draft for Inserting multimedia objects into HTML3 (http://www.w3.org/pub/WWW/TR/WD-insert.html).


Introduction

HTML documents are typically retrieved over the net from files held on HTTP servers. In some cases, the HTML is generated dynamically as the result of a program run by the server, e.g. via a CGI script. This specification extends HTML to support client-side scripting of HTML documents including objects embedded within HTML documents.

Scripts can be supplied in separate files or embedded directly within HTML documents in a manner independent of the scripting language. Scripts allow HTML forms to process input as it is entered: to ensure that values conform to specified patterns, to check consistency between fields and to compute derived fields. Scripts can also be used to simplify authoring of active documents. The behaviour of objects inserted into HTML documents can be tailored with scripts that respond to events generated by such objects.

This enables authors to create compelling and powerful web content. Developers have been experimenting with ideas for integrating executable script code within HTML. To date, the most prominent example is Netscape and Sun's "JavaScript". JavaScript code is embedded within HTML through the use of the Netscape-defined SCRIPT tag.

This specification formalizes the SCRIPT tag as defined by Netscape and currently implemented within Netscape Navigator 2.0 beta, and is intended to be compatible with it. In addition this specification provides a mechanism whereby user agents can be developed such that they can support any scripting language or system in a completely pluggable way.

This specification covers the syntax and semantics for inserting scripts into HTML documents, but leaves out the architectural and application programming interface issues for how scripting engines are implemented and how they communicate with the document and other objects on the same page.


The Computational Model for HTML Scripts

An HTML document may be considered as a sequence of objects specified by HTML elements, for example a header, a radio button in a form and a Java applet referenced by an INSERT element. Scripts are programs that can send messages to these objects, to change or query their state. Some examples of possible messages are:

Identification of Objects

The first generation of scripting languages uses the HTML ID attribute to identify objects, as this is unique throughout the current document. Other ways in which objects could be identified include:

Events Generated by Objects

In addition to responding to messages sent from scripts, objects may generate messages themselves. This specification defines a fixed set of events that HTML objects may generate, and to which scripts can respond. External objects inserted into the HTML document, e.g. using the INSERT element, can generate an essentially unlimited range of event types. The fixed events defined by Netscape for HTML are:

OnLoad
Sent by an object when it is first loaded. This allows script handlers to initialize objects and to allocate resources to be associated with the object.
OnUnload
Sent by an object when it is being unloaded. This allows resources associated with the object to be freed.
OnClick
Sent by an object when the user clicks on it with a pointer device.
OnMouseOver
Sent by an object as the pointer device is moved across it.
OnFocus
Sent by a form field when it gains the keyboard input focus.
OnBlur
Sent by a form field when it loses the keyboard input focus.
OnSubmit
Sent by form elements when the user submits the form.
OnSelect
Sent by an object when the user selects some of the text within a form text field.
OnChange
Sent by an object when it changes, e.g. when the user alters the textual contents of a text field in a form.

The following additional events are proposed to simplify scripting and to provide a more complete set of events for intrinsic HTML elements.

OnHide
Sent by an object when it is hidden, either by another window, or when the document is pushed onto the history stack when the user follows a link to another page.
OnShow
Sent by an object when it is shown, e.g. when the user backtracks to an earlier page, or when the page is shown following a click on a link. This is the inverse of the OnHide event.
OnDoubleClick
Sent by an object when the user double clicks on it with a pointer device.
OnTrack
This is a synonym for OnMouseOver and is sent when the pointer device is moved over an object.
OnDrag
Sent by an object as it is dragged by a pointer device.
OnDrop
Sent by an object when it is dropped after a drag operation.
OnChar
Sent by the object with the focus when the user types a character.
OnUnfocus
This is a synonym for OnBlur and is sent when the object loses the input focus.

The messages for each of these events include certain parameters. All events include a handle to the originating object, so that the script can send messages back to the object that sent the event. Certain events include additional parameters as follows:

OnClick
OnDoubleClick
The number of the button which was clicked as well as the x and y offset for the location clicked.
OnTrack
OnDrag
OnDrop
The x and y offset of the current pointer position.
OnChar
The character code and any associated modifiers.
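
As an illustration of the parameter lists above, the following JavaScript sketch models an event message as a plain object. This draft does not define a formal event interface, so every name used here (makeEvent, source, button, x, y, setLabel) is an assumption made for the example only.

```javascript
// Hypothetical model of event messages; none of these names are defined by this draft.
// Every event carries a handle to the originating object ("source").
function makeEvent(name, source, params) {
  var event = { name: name, source: source };
  for (var key in params) {
    if (Object.prototype.hasOwnProperty.call(params, key)) {
      event[key] = params[key];
    }
  }
  return event;
}

// A stand-in for a document object that can receive messages from scripts.
var button = {
  id: "ok",
  label: "OK",
  setLabel: function (text) { this.label = text; }
};

// OnClick carries the button number and the x and y offset of the click.
var click = makeEvent("OnClick", button, { button: 1, x: 12, y: 7 });

// A handler can use the handle to send a message back to the originating object.
function onClickHandler(event) {
  event.source.setLabel("Clicked at " + event.x + "," + event.y);
}

onClickHandler(click);
```

A formal definition, e.g. in IDL, would replace this ad hoc object with typed interfaces.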

Note: Do we want to include a formal definition of events using say IDL? What additional parameters should be included and for what event types?

We need to define which events are generated by which HTML elements.

Associating scripts with HTML

Scripts can be placed in separate files and referenced from the HTML document via URLs. One way of achieving this is to use the LINK element and REL=script, e.g.

    <LINK REL=script TYPE="application/perl; version=5.0" HREF="script.pl">

The forward relation "script" indicates that the URL specified with the HREF attribute references a script. The TYPE attribute is optional and provides an advisory indication of the scripting language used. Authors can provide several such LINK elements, as alternative scripts, e.g. in different scripting languages. The TYPE attribute is then used by the user agent to select a script in a language for which it has support.
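
One way a user agent might choose among alternative LINK elements is sketched below. The function names, the media type "application/x-javascript" and the treatment of links that lack a TYPE attribute (kept only as a fallback) are assumptions made for this example. The draft does not specify a selection algorithm.

```javascript
// Hypothetical selection of one script from several LINK REL=script alternatives.

// Strip parameters such as "; version=5.0" so that only the media type is compared.
function baseType(type) {
  return type.split(";")[0].trim().toLowerCase();
}

// Return the first link whose media type is supported. A link without TYPE
// is remembered as a fallback, since TYPE is optional and only advisory.
function selectScript(links, supportedTypes) {
  var untyped = null;
  for (var i = 0; i < links.length; i++) {
    var link = links[i];
    if (!link.type) {
      if (untyped === null) { untyped = link; }
      continue;
    }
    if (supportedTypes.indexOf(baseType(link.type)) !== -1) {
      return link;
    }
  }
  return untyped;
}

var links = [
  { rel: "script", type: "application/perl; version=5.0", href: "script.pl" },
  { rel: "script", type: "application/x-javascript", href: "script.js" }
];

// A user agent that only supports the (illustrative) JavaScript media type.
var chosen = selectScript(links, ["application/x-javascript"]);
```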

The SCRIPT element can be used to reference external scripts using the SRC attribute and to include script statements within the HTML document, e.g.

    <SCRIPT SRC=script.js>
    ... Additional JavaScript statements ... 
    </SCRIPT>

The optional SRC attribute gives a URL to an external script. This is formally equivalent to using a LINK element with REL=script. External script statements are read in and evaluated before the script statements contained within the SCRIPT element itself. Functions are stored, but not executed. Functions are executed by events in the document.
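
The evaluation order described above can be pictured with the following JavaScript sketch. The log array, the handlers table and both function names are assumptions used to make the order visible. They are not part of this specification.

```javascript
// Hypothetical trace of the evaluation order for <SCRIPT SRC=...> with inline statements.
var log = [];
var handlers = {};

// Statements from the external script (SRC) are read in and evaluated first.
function runExternalScript() {
  log.push("external statements evaluated");
}

// Statements inside the SCRIPT element are evaluated next. Defining a
// function only stores it; its body does not run at this point.
function runInlineScript() {
  log.push("inline statements evaluated");
  handlers.OnClick = function () { log.push("OnClick handler executed"); };
}

runExternalScript();
runInlineScript();

// Later, an event in the document causes the stored function to be executed.
handlers.OnClick();
```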

HTML documents can include multiple SCRIPT elements which can be placed in the document HEAD or BODY. This allows script statements for a form to be placed near to the corresponding FORM element. The scripting language can be specified using an attribute on the SCRIPT element as follows:

LANGUAGE
The LANGUAGE attribute provides a text string that identifies the programming language, e.g. LANGUAGE="JavaScript".
TYPE
The TYPE attribute specifies an Internet media type and associated parameters for the scripting language, e.g. TYPE="application/perl; version=5.0"
SCRIPTENGINE
The INSERT element enables arbitrary code ("applets", if you will) to be downloaded, installed and executed. This allows script interpreters to be treated as just another component to be inserted into an HTML document. The SCRIPTENGINE attribute specifies the interpreter (or script engine) via a URL that names the INSERT element for the interpreter.

For example, if there were an implementation of the Perl language interpreter as a COM object, the following could appear in the HTML document's HEAD section:

    <insert
       id="PERL"
       classid="progid:Perl.Interpreter"
       code="http://www.acme.com/perl/bin/perl.cab"
    >
    </insert>
    <script SCRIPTENGINE=PERL>
             # perl script here
             #
    </script>

In the absence of any of these attributes, the scripting language is assumed to be the same as the external script referenced by the SRC attribute. The default scripting language is assumed to be JavaScript.

Handling Events with HTML Attributes

This specification proposes a number of new generic attributes for HTML elements to allow event handlers to be specified in-place with HTML elements. The attributes have the same name as the events listed earlier, e.g. "OnFocus", "OnClick" etc. The attribute value is a script statement that is executed when the object associated with the element generates the corresponding event. In the following example, userName is a required text field. When a user attempts to leave the field, the OnUnfocus event handler calls the required() function to confirm that userName has a legal value.

    <INPUT NAME="userName" OnUnfocus="required(this.value)">
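
The required() function is named in the example above but is not defined by this draft. A minimal sketch of what such a function might look like, assuming that a "legal value" simply means a non-empty string, is:

```javascript
// Hypothetical validation helper; the rule "non-empty after trimming" is an assumption.
function required(value) {
  // Treat a missing value, or one consisting only of white space, as not legal.
  return typeof value === "string" && value.replace(/^\s+|\s+$/g, "") !== "";
}
```

A real handler would probably also report the problem to the user or return the focus to the field.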

One of the most frequent uses of HTML scripting is data validation on form INPUT tags. In Netscape's implementation of JavaScript, for example, a typical INPUT tag might look like this:

    <INPUT NAME="num"
        ONCHANGE="if (!checkNum(this.value, 1, 10)) 
            {this.focus();this.select();} else {thanks()}"
        VALUE="0">

The value of the ONCHANGE attribute is called a "scriptlet." The scripting language assumed for the event handler attributes is determined by the most recent SCRIPT element preceding the element in which the event handler attribute occurs. In the absence of any SCRIPT elements, the most recent LINK element with REL=script is used. The default scripting language is assumed to be JavaScript.
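
The scriptlet in the INPUT example calls checkNum(), which this draft leaves undefined. One possible definition is sketched below, assuming the function accepts only whole numbers within an inclusive range. Both that rule and the parameter names are assumptions.

```javascript
// Hypothetical helper matching the call checkNum(this.value, 1, 10).
function checkNum(value, min, max) {
  // Accept only an optionally signed run of decimal digits.
  if (!/^[+-]?\d+$/.test(String(value))) {
    return false;
  }
  var n = parseInt(value, 10);
  // The range check is inclusive at both ends.
  return n >= min && n <= max;
}
```

With this definition the initial VALUE of "0" in the example would fail the check if the user re-entered it, since 0 lies outside the range 1 to 10.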

Extensions to the INSERT element

There are three extensions to the INSERT specification as defined in http://www.w3.org/pub/WWW/TR/WD-insert.html.

The NAME attribute

The NAME attribute allows an INSERT element to act as a new kind of HTML form field. The attribute specifies the name to be used to label the field's data when the form is submitted. It is directly analogous to the NAME attribute on the INPUT element.

The SCRIPTENGINE attribute

This attribute plays the same role as the SCRIPTENGINE attribute of the SCRIPT element. It gives authors the ability to specify the scripting engine to be used to interpret the scripting statements included with EVENT elements as part of the INSERT element. The attribute value is a URL referencing an INSERT element for the scripting engine.

The EVENT element

The INSERT element allows a wide range of objects to be inserted into HTML documents. These objects can generate new types of events in addition to the standard ones. The EVENT element allows authors to place script statements for responding to these events as part of the contents of an INSERT element. The EVENT element has two required attributes:

NAME
A text string specifying the event name.
HANDLER
The script statement(s) to be executed when the associated object sends the named event.

In the absence of a SCRIPTENGINE attribute on the INSERT element, the scripting language assumed for the event handler is determined by the most recent SCRIPT element preceding the element in which the event handler attribute occurs. In the absence of any SCRIPT elements, the most recent LINK element with REL=script is used. The default scripting language is assumed to be JavaScript.
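
The fallback rule for handlers without a SCRIPTENGINE attribute can be sketched as follows. The sketch models the document as a flat, ordered list of elements and gives each SCRIPT or LINK entry a "language" property. Both simplifications, and the function name, are assumptions made for this example.

```javascript
// Hypothetical resolution of the scripting language for an event handler
// on the element at position "index" in document order.
function handlerLanguage(elements, index) {
  var i;
  // 1. The most recent SCRIPT element preceding the element.
  for (i = index - 1; i >= 0; i--) {
    if (elements[i].tag === "SCRIPT") {
      return elements[i].language || "JavaScript";
    }
  }
  // 2. Otherwise the most recent LINK element with REL=script.
  for (i = index - 1; i >= 0; i--) {
    if (elements[i].tag === "LINK" && elements[i].rel === "script") {
      return elements[i].language || "JavaScript";
    }
  }
  // 3. Otherwise the default scripting language.
  return "JavaScript";
}

var doc = [
  { tag: "LINK", rel: "script", language: "Perl" },
  { tag: "INPUT" },
  { tag: "SCRIPT", language: "VBScript" },
  { tag: "INPUT" }
];

var first = handlerLanguage(doc, 1);   // only the LINK precedes this INPUT
var second = handlerLanguage(doc, 3);  // a SCRIPT precedes this INPUT
```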

If the HTML author wanted to use a control richer than the intrinsic HTML form text input control (for example a rich text edit control) the insert tag would look like this:

<INSERT NAME="num" ID="num"
       CLASSID=classid:CLSID_RichEdit SCRIPTENGINE=JAVASCRIPT>
    <PARAM NAME="VALUE" VALUE="0">
    <EVENT NAME="ONCHANGE" HANDLER="if (!checkNum(this.value, 1, 10)) 
        {this.focus();this.select();} else {thanks()}">
</INSERT>

ID allows script code which is not inline (e.g. is in the HEAD or elsewhere in the BODY) that wishes to manipulate this control to find it. NAME indicates that the VALUE property of this control is to be used as part of the submit process. If NAME were absent the control would be treated as though it were not actually part of the form (even though it may have appeared within a FORM block).
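
The role of NAME during submission can be illustrated with the sketch below, in which controls without a NAME are skipped when the form data is assembled. The collectFormData function and the object layout are assumptions. The encoding is also simplified, since encodeURIComponent writes a space as "%20" where form encoding would use "+".

```javascript
// Hypothetical model of how form data might be collected on submit.
function collectFormData(controls) {
  var pairs = [];
  for (var i = 0; i < controls.length; i++) {
    var control = controls[i];
    // A control without NAME is treated as though it were not part of the form.
    if (!control.name) {
      continue;
    }
    pairs.push(encodeURIComponent(control.name) + "=" +
               encodeURIComponent(control.value));
  }
  return pairs.join("&");
}

var controls = [
  { tag: "INPUT", name: "userName", value: "Dave" },
  { tag: "INSERT", id: "num", name: "num", value: "7" },
  { tag: "INSERT", id: "preview", value: "not submitted" }
];

var query = collectFormData(controls);
```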

Deployment Issues

Authors may wish to design their HTML documents to be viewable on older browsers that don't recognise the SCRIPT element. Unfortunately, such browsers display any script statements placed within a SCRIPT element as ordinary text. One solution is to enclose the script statements in an SGML comment, for instance:

<SCRIPT LANGUAGE="JavaScript">
<!--  to hide script contents from old browsers
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
// end hiding contents from old browsers  -->
</SCRIPT>

Another solution is to use an SGML marked section to hide the script statements:

<SCRIPT LANGUAGE="JavaScript">
<![ %if-script [
  function square(i) {
    document.write("The call passed ", i ," to the function.","<BR>")
    return i * i
  }
  document.write("The function returned ",square(5),".")
]]>
</SCRIPT>

The <![ and ]]> in this example indicate the start and end of the marked section. The replacement text for the entity "%if-script" determines how SGML compliant parsers process the contents of the marked section. The following is an extract from the "Guidelines for Electronic Text Encoding and Interchange", edited by C. M. Sperberg-McQueen and Lou Burnard.

INCLUDE
The marked section should be included in the document and processed normally.
IGNORE
The marked section should be ignored entirely; if the SGML application program produces output from the document, the marked section will be excluded from the document.
CDATA
The marked section may contain strings of characters which look like SGML tags or entity references, but which should not be recognized as such by the SGML parser. (These Guidelines use such CDATA marked sections to enclose the examples of SGML tagging.)
RCDATA
The marked section may contain strings of characters which look like SGML tags, but which should not be recognized as such by the SGML parser; entity references, on the other hand, may be present and should be recognized and expanded as normal.
TEMP
The passage included in the marked section is a temporary part of the document; the marked section is used primarily to indicate its location, so that it can be removed or revised conveniently later.

This specification suggests that the replacement text for the entity if-script is formally defined as "RCDATA" for user agents that support scripts, otherwise it is defaulted to "IGNORE" which causes the contents of the marked section to be ignored entirely.

Experiments with several widely deployed browsers suggest that marked sections can be used effectively for hiding scripts provided the following guidelines are adhered to:

Note: It would be cleaner to use "CDATA" rather than "RCDATA", but certain older browsers incorrectly treat a ">" character as the end of the marked section, so "&gt;" must be used in place of such characters where they occur in the script. It is also impractical to place the marked section around the SCRIPT element, as this causes some very widely deployed browsers to incorrectly show the string "]]>".
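
The effect of the keyword chosen for if-script can be sketched in JavaScript as below. The function names are assumptions, only three entity references are expanded, and the INCLUDE and TEMP keywords are deliberately left out, so this is an illustration rather than a description of any SGML parser.

```javascript
// Hypothetical sketch of the script text a scripting engine would receive
// from a marked section, depending on the status keyword.

// Expand a small, illustrative set of entity references in one pass.
function expandEntities(text) {
  var table = { lt: "<", gt: ">", amp: "&" };
  return text.replace(/&(lt|gt|amp);/g, function (match, name) {
    return table[name];
  });
}

function scriptTextFromMarkedSection(keyword, content) {
  switch (keyword) {
    case "IGNORE":
      return "";                       // contents are ignored entirely
    case "CDATA":
      return content;                  // nothing is recognized as markup
    case "RCDATA":
      return expandEntities(content);  // entity references are expanded
    default:
      throw new Error("keyword not handled in this sketch: " + keyword);
  }
}

// A script line written with "&gt;" so that older browsers are not confused.
var source = "if (a &gt; b) { max = a }";
var seenByEngine = scriptTextFromMarkedSection("RCDATA", source);
```

Under RCDATA the scripting engine sees the ">" character restored, whereas under IGNORE it sees no script at all.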


Further Work

This specification defines the extensions to HTML3 needed to support scripting. To ensure that scripts and plug-ins work smoothly with browser implementations from different vendors, we also need well defined application programming interfaces (APIs) for how documents communicate with plug-ins. For instance what kinds of messages can be sent by user agents to plug-ins and vice versa. It may be worth developing a language and platform independent API for this based on IDL (interface definition language).


HTML Scripting and INSERTs DTD

The DTD or document type definition provides the formal definition of the allowed syntax for HTML extensions for INSERT and SCRIPT:

<!-- Content model entities imported from parent DTD:
  %body.content allows INSERTs to contain headers, paras,
  lists, form elements and even arbitrarily nested scripts
-->

<!ENTITY % attrs
       "id      ID       #IMPLIED  -- element identifier --
        class   NAMES    #IMPLIED  -- for subclassing elements --
        style   CDATA    #IMPLIED  -- rendering annotation --
        dir   (ltr|rtl)  #IMPLIED  -- I18N text direction --
        lang    NAME     #IMPLIED  -- as per RFC 1766 --">

<!ENTITY % URL "CDATA" -- universal resource locator -->
<!ENTITY % Align
"(top|middle|bottom|left|center|right)">
<!ENTITY % Length "CDATA" -- standard length value -->

<!-- INSERT is a character-like element for inserting objects -->
<!ELEMENT insert - - (param*, event*, bodytext)>
<!ATTLIST insert
        %attrs      -- id, class, style, lang, dir --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        type    CDATA    #IMPLIED   -- Internet media type --
        align   %Align   #IMPLIED   -- positioning inside document --
        height  %Length  #IMPLIED   -- suggested height --
        width   %Length  #IMPLIED   -- suggested width --
        border  %Length  #IMPLIED   -- suggested link border width --
        hspace  %Length  #IMPLIED   -- suggested horizontal gutter --
        vspace  %Length  #IMPLIED   -- suggested vertical gutter --
        usemap  %URL     #IMPLIED   -- ref to image map --
        ismap   (ismap)  #IMPLIED   -- use server image map --
        >

<!-- the BODYTEXT element is needed to avoid problems with
      SGML mixed content, but is never used in actual documents -->
<!ELEMENT bodytext O O %body.content>

<!ELEMENT param - O EMPTY -- named property value -->
<!ATTLIST param
        name    CDATA    #REQUIRED  -- property name --
        value   CDATA    #IMPLIED   -- property value --
        valueref  %URL   #IMPLIED   -- ref to object ALIAS --
        type    CDATA    #IMPLIED   -- Internet media type --
        >

<!-- ALIAS is allowed anywhere in document HEAD and BODY
     it defines an alias for an object without inserting it -->
<!ELEMENT alias - - (param*, alias?)>
<!ATTLIST alias
        id      ID       #REQUIRED  -- defines name for alias --
        data    %URL     #IMPLIED   -- ref to object's data --
        code    %URL     #IMPLIED   -- ref to object's code --
        classid %URL     #IMPLIED   -- object's UUID --
        type    CDATA    #IMPLIED   -- Internet media type --
        >

<!-- EVENT element used to trap events generated by INSERTs -->
<!ELEMENT event - O EMPTY>
<!ATTLIST event
        name    CDATA    #REQUIRED  -- name of event --
        handler CDATA    #IMPLIED   -- script statement(s) to execute --
        >

<!-- SCRIPT is a character-like element for embedding script code
      that can be placed anywhere in the document head or body -->
<!ELEMENT script - - (#PCDATA)*>
<!ATTLIST script
        language     CDATA    #IMPLIED -- predefined script language name --
        type         CDATA    #IMPLIED -- script language media type --
        scriptengine %URL     #IMPLIED -- URL for a specific script engine --
        src          %URL     #IMPLIED -- URL for an external script --
        >

References

Internet Media Types - RFC 1590
J. Postel. "Media Type Registration Procedure." RFC 1590, USC/ISI, March 1994. This can be found at ftp://ds.internic.net/rfc/rfc1590.txt.
MIME - RFC 1521
Borenstein N., and N. Freed, "MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies", RFC 1521, Bellcore, Innosoft, September 1993. This can be found at ftp://ds.internic.net/rfc/rfc1521.txt
SGML Marked Sections
Dan Connolly has a paper on the use of SGML Marked Sections at http://www.w3.org/pub/WWW/MarkUp/WD-doctypes. The TEI also has information at http://www.ebt.com/usrbooks/teip3/2404.
The Component Object Model specification
This is available from http://www.microsoft.com/intdev/inttech/comintro.htm.

W3C: The World Wide Web Consortium: http://www.w3.org/