XForms Transitional

This is an editor's draft of a specification for extensions to the HTML4 forms markup and object model that provide spreadsheet-like capabilities without the need for scripting. The extensions include simple declarative expressions for calculated fields; a richer set of intrinsic data types including numbers, dates and times; simple ways to specify validation constraints; the ability to determine when a field must be filled out based upon the values of other fields, and similarly when a field or group of fields is irrelevant and can be hidden from view; and a simple means to support repeating sets of fields.

XForms Transitional enables developers to take advantage of the declarative features of XForms bind constraints whilst remaining within the framework of HTML4 and text/html. Developers can gain some of the benefits of XForms without first having to learn about XML namespaces, XPath and a whole new way of doing forms. XForms offers enterprise strength forms, and XForms Transitional provides a stepping stone to realizing that potential for when people are ready to make the transition to the power and flexibility of XML.

Assumptions

For ease of learning, this specification is limited to incremental extensions to HTML4, and to features that are motivated by existing practice and experience with spreadsheets.

To facilitate rapid deployment of XForms Transitional, it should be possible to support the extensions defined in this specification on a very high percentage of existing Web browsers via Web page script libraries as an alternative to relying on native browser support.

Simple declarative expressions offer greater convenience, and reduce the costs of developing Web applications and the likelihood of errors compared with having to develop page specific scripts for event handlers. Declarative approaches also avoid the need to keep separately written client and server-side validation code in sync, further reducing development and maintenance costs.

Device independent representations provide greater flexibility for adapting Web applications to suit the needs of particular users and devices. This offers important benefits for usability given the increasing diversity of device capabilities.

Declarative formats facilitate round tripping of semantics from editors to documents and back to the editor the next time the document is reloaded. This enables editors to hide the details of markup, style sheets and scripts. If the semantics are expressed procedurally as scripts, then only programmers will be able to understand them. Declarative formats are therefore critically important to enabling a wide cross section of the population to edit Web applications, regardless of age or background.

Markup Extensions

The markup extensions include a small set of new attributes, and some additional attribute values. Many of these attributes are used with expressions. These expressions are required to conform to a subset of the ECMA 262 r-value expression syntax, to be side-effect free, and to be amenable to static analysis. Expressions may contain the names of fields as given by the field's name attribute. The scoping rules limit references to fields within the same form as the field on which the expression is defined and, failing that, to variables defined in the global scope. Expressions may include calls to functions defined in the global scope, e.g. author defined functions specified in a script element. Expressions may be evaluated by applying some rewrite rules when the Web page is loaded and calling the ECMA 262 eval function as needed. More details are given in the section on the Object Model.

This specification does not define user agent behavior if expressions have side effects. As with event handlers, expression evaluation should happen quickly enough to maintain a responsive user interface. External functions should therefore avoid protracted computations. An exception must be raised if the expression syntax doesn't conform to the defined subset or if a cyclic dependency is detected between calculated fields.

Labels

The HTML4 label element provides a means to bind a descriptive label to a form field e.g.

<label for="f1">Name</label>
<input id="f1" name="name" />

where the for attribute is matched against the id attribute on the input element. This specification widens the definition of the for attribute to match against field name attributes in the absence of a field with matching id attribute. This has two benefits. Firstly, it allows a given label to be associated with all fields with the same name. This is particularly useful for repeated sets of fields, and where the label may be shown distinctly in some manner when any of the associated fields are invalid or are required but not filled out. Secondly, the name attribute is always required, and in many cases it is sufficient to uniquely identify the field. In such cases the id attribute is functionally redundant.
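The widened matching rule can be sketched as follows. This is only an illustration over plain objects rather than DOM nodes; the helper name fieldsForLabel and the { id, name } descriptor shape are assumptions made for the example, not part of this specification.

```javascript
// Hypothetical helper: resolve a label's "for" value against a list of
// field descriptors of the form { id, name }.
function fieldsForLabel(forValue, fields) {
  // An id match takes priority, as in HTML4.
  const byId = fields.find((field) => field.id === forValue);
  if (byId) return [byId];
  // Otherwise fall back to every field whose name matches.
  return fields.filter((field) => field.name === forValue);
}

const fields = [
  { id: "f1", name: "name" },
  { name: "quantity" }, // first row of a repeated set
  { name: "quantity" }, // second row
];

console.log(fieldsForLabel("f1", fields).length);       // 1
console.log(fieldsForLabel("quantity", fields).length); // 2
```

A label with for="quantity" is thereby associated with both rows, which is what allows a single label to reflect the state of every field in a repeated column.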

HTML4 user agents typically allow you to click on the label as an alternative to clicking on the associated field. If the label is for a group of fields with the same name, user agents could set the focus to the first such field in document order, but it would be inappropriate to activate a checkbox in this manner. The semantic association between the label and such groups of fields is important for accessibility purposes, as well as enabling the label to show when any of its fields are invalid.

Extended Set of Data Types

The type attribute for the input element is extended to allow for a wider set of intrinsic data types. The new types include:

type="number"

The field value must be an integer or a floating point number as defined by ECMA 262 (ECMAScript). The number of digits after the decimal point can be controlled via the step attribute.

Whilst the step attribute can be used to control the precision of a number, it affects the stored value as well as the presentation. A much better solution would be to introduce a mechanism for controlling the presentation independently of the value held in the DOM. The mechanism should also permit the use of locale dependent formatting conventions, e.g. for the thousands separator and for the decimal point.

type="date"

The field value must be a date that can be held with an ECMA 262 Date object. Users should be able to enter dates in a range of formats with the date's presentation being reformatted into a locale dependent canonical format when the focus is moved away and prior to the change event being raised. Users should be able to enter dates in a manner that is convenient to their personal needs, for instance, via a sequence of key strokes, or via selecting a date with a date picker control, or by speaking the date.

The intention is to support a similar behavior to that found with date fields in spreadsheets where it is common to be able to enter dates in a variety of formats, e.g. 15-jan-07, 15/jan/07, 15/01/2007, 15 January 2007, etc. The value entered by the user is parsed into an internal representation, and for submission purposes, is formatted either in a locale dependent format or as an ISO 8601 Date. Showing an error when users type what for them is a perfectly normal date format is likely to lead to developers abandoning browser defined controls in favor of more flexible solutions.
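As a rough illustration of this kind of tolerant parsing, the sketch below accepts the four example formats and produces an ISO 8601 date. The function name parseFlexibleDate is hypothetical, and two assumptions are built in: dates are in day-month-year order, and two-digit years belong to the 2000s. A real user agent would choose these according to the locale.

```javascript
const MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
                "jul", "aug", "sep", "oct", "nov", "dec"];

// Hypothetical helper: returns "YYYY-MM-DD", or null if the text is not
// recognized as a day-month-year date.
function parseFlexibleDate(text) {
  // day, month (digits or name) and year, separated by "-", "/" or spaces
  const m = /^\s*(\d{1,2})[-\/\s]+([A-Za-z]+|\d{1,2})[-\/\s]+(\d{4}|\d{2})\s*$/.exec(text);
  if (!m) return null;
  const day = Number(m[1]);
  // Month names are recognized by their first three letters.
  const month = /^\d+$/.test(m[2])
    ? Number(m[2])
    : MONTHS.indexOf(m[2].slice(0, 3).toLowerCase()) + 1;
  // Assumption: a two-digit year means 20xx.
  const year = m[3].length === 2 ? 2000 + Number(m[3]) : Number(m[3]);
  if (month < 1 || month > 12) return null;
  // Reject impossible days such as 31 February by round-tripping via Date.
  const date = new Date(Date.UTC(year, month - 1, day));
  if (date.getUTCMonth() !== month - 1 || date.getUTCDate() !== day) return null;
  const pad = (n) => String(n).padStart(2, "0");
  return `${year}-${pad(month)}-${pad(day)}`;
}

console.log(parseFlexibleDate("15-jan-07"));       // "2007-01-15"
console.log(parseFlexibleDate("15 January 2007")); // "2007-01-15"
```

Keeping the parsed value separate from the text the user typed is what allows the field to be redisplayed in a canonical format while submitting ISO 8601.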

type="time"

The field value must correspond to a time of day in a locale dependent format, e.g. 4:30pm or 16:30. Users should be able to enter times in a manner that is convenient to their personal needs, for instance, via a sequence of key strokes, or via selecting a time with a time picker control, or by speaking the time.
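A regular expression based parser for the two example formats might look like the sketch below. The name parseTimeOfDay and the choice of a 24-hour "HH:MM" string as the internal form are assumptions for illustration only.

```javascript
// Hypothetical helper: parse "4:30pm" or "16:30" into 24-hour "HH:MM",
// or return null if the text is not a recognizable time of day.
function parseTimeOfDay(text) {
  const m = /^\s*(\d{1,2}):(\d{2})\s*(am|pm)?\s*$/i.exec(text);
  if (!m) return null;
  let hours = Number(m[1]);
  const minutes = Number(m[2]);
  const suffix = m[3] ? m[3].toLowerCase() : null;
  if (minutes > 59) return null;
  if (suffix) {
    // 12-hour clock: 12am is midnight and 12pm is noon.
    if (hours < 1 || hours > 12) return null;
    hours = (hours % 12) + (suffix === "pm" ? 12 : 0);
  } else if (hours > 23) {
    return null;
  }
  const pad = (n) => String(n).padStart(2, "0");
  return `${pad(hours)}:${pad(minutes)}`;
}

console.log(parseTimeOfDay("4:30pm")); // "16:30"
console.log(parseTimeOfDay("16:30"));  // "16:30"
```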

Applies to input elements.

There are a range of other types to be considered, e.g. URI and email address(es). For some cases, it is sufficient to use a text control constrained with a regular expression. For email addresses, there is the consideration of being able to offer users the means to pick addresses from their address book. For web mail applications, users would expect to be able to enter a list of addresses, in the same formats as they are used to. This suggests that there is a need for control over whether one or more addresses is permitted, and for the control to accept a range of address formats, and to reformat them as appropriate, in a manner analogous to date controls.

Field Validation

Validation constraints can be expressed using several new attributes. The field is deemed invalid if its value violates the constraints imposed by any one of these attributes, or if it violates the data type constraint associated with the "type" attribute.

It is generally desirable to check the value of a field as it is being typed. This is particularly appropriate when a field is constrained by a regular expression or is a simple type like a number. The evaluation of cross-field dependencies may be left until the change event is raised.

Applies to input, select and textarea elements.

Regular Expressions

The pattern attribute is used to provide an ECMA 262 regular expression that constrains the field value to a matching text string.
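A minimal sketch of a pattern check is shown here. It assumes that the pattern has to match the whole field value rather than a substring, and that empty fields are not flagged; both are assumptions of the example, and the name matchesPattern is hypothetical.

```javascript
// Hypothetical helper: test a field value against its pattern attribute.
function matchesPattern(value, pattern) {
  // Assumption: empty fields are left to the required attribute.
  if (value === "") return true;
  // Assumption: the pattern must match the entire value, so it is
  // wrapped in ^(?:...)$ before being compiled.
  return new RegExp("^(?:" + pattern + ")$").test(value);
}

console.log(matchesPattern("AB-123", "[A-Z]{2}-\\d{3}"));  // true
console.log(matchesPattern("AB-1234", "[A-Z]{2}-\\d{3}")); // false
```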

Minimum Value

The min attribute is used to provide a lower bound to the field's value. This can be used for numbers, dates and times, using the formats permitted with the associated datatypes as specified for the type attribute.

Maximum Value

The max attribute is used to provide an upper bound to the field's value. This can be used for numbers, dates and times, using the formats permitted with the associated data types as specified for the type attribute.

Step Value

The step attribute is used to constrain the field's value to an integral multiple of the step plus the minimum value if the min attribute is defined, and may be used in conjunction with the max attribute to set an upper bound. The step attribute can be used for numbers, dates and times, using the formats permitted with the associated data types as specified for the type attribute.

The default value for step is infinitesimally small for type="number", one day for type="date", one minute for type="time" and the integer 1 for type="range", see below.
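The interaction of min, max and step for numeric values can be sketched as below. The function name satisfiesStep, the use of zero as the base when min is absent, and the small floating point tolerance are all assumptions of this example.

```javascript
// Hypothetical helper: check a numeric value against optional min, max
// and step constraints.
function satisfiesStep(value, { min, max, step } = {}) {
  if (min !== undefined && value < min) return false;
  if (max !== undefined && value > max) return false;
  if (step === undefined) return true;
  // The value must be the base plus a whole number of steps.
  // Assumption: the base is zero when no min is given.
  const base = min !== undefined ? min : 0;
  const steps = (value - base) / step;
  // Allow for rounding error, e.g. 0.3 / 0.1 is not exactly 3.
  return Math.abs(steps - Math.round(steps)) < 1e-9;
}

console.log(satisfiesStep(7, { min: 1, max: 10, step: 3 })); // true  (1 + 2 * 3)
console.log(satisfiesStep(8, { min: 1, max: 10, step: 3 })); // false
```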

Constraint Expressions

The constraint attribute is used to provide an expression over field values that evaluates to a boolean result. This can be used to constrain a field based upon the value of other fields.

Range Controls

If an input element has type="range" then the user agent should allow the user to pick a number in the range set by the min and max attributes. The step attribute constrains the value to be min plus a positive integral multiple of the step so long as the result is less than or equal to max. The default value of step is 1. Range controls are often implemented in graphical user interfaces as sliders or as spin controls, and should provide an indication of the numerical value selected. For accessibility, there should be an alternative to using a pointer device to set the value, for example, the means to type the value directly.

Applies to input elements.

Editable Select Controls

In HTML4 the select element presents a fixed set of choices for the user to select from. It is often desirable to give the user the means to type a value in addition to the set of predefined choices. The editable attribute can be used to enable this. The value must be either "editable" or a positive integer. If the value is an integer it is interpreted as denoting the width in characters (by analogy to the size attribute on the input element). In the absence of the attribute, only the predefined choices are presented. The maxlength attribute may be used to specify the maximum number of characters that can be entered for a user defined entry.

A compatibility library can provide an input element for editable selections for user agents without native support.

Calculated Field Values

The calculate attribute is used to provide an expression which is dynamically evaluated to obtain the field's value. The type of the expression is coerced to match the field's type attribute. Calculated fields may be dependent on the values of other calculated fields. The order in which such fields should be calculated can be determined through a topological sort of field dependencies. External functions called from expressions should avoid accessing fields except for those passed as parameters to the function. An expression is only considered valid if it permits static analysis, see the expression grammar as defined below.

Applies to input elements. The restrictions on external functions avoid the impracticality of trying to analyse in advance the dependencies involved in the execution of these functions.
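One way to derive the calculation order, and to detect cyclic dependencies at the same time, is a depth-first topological sort. The sketch below works on a plain object that maps each calculated field to the names it reads; the function name calculationOrder and this input shape are assumptions for illustration.

```javascript
// Hypothetical helper. deps maps each calculated field name to the
// names of the fields its calculate expression reads.
function calculationOrder(deps) {
  const isCalculated = (name) => Object.prototype.hasOwnProperty.call(deps, name);
  const order = [];
  const state = new Map(); // name -> "visiting" | "done"

  function visit(name, path) {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") {
      throw new Error("Cyclic dependency: " + path.concat(name).join(" -> "));
    }
    state.set(name, "visiting");
    const reads = isCalculated(name) ? deps[name] : [];
    for (const dep of reads) visit(dep, path.concat(name));
    state.set(name, "done");
    // Only calculated fields need evaluating; plain inputs are skipped.
    if (isCalculated(name)) order.push(name);
  }

  for (const name of Object.keys(deps)) visit(name, []);
  return order;
}

const order = calculationOrder({
  total: ["subtotal", "tax"],
  tax: ["subtotal"],
  subtotal: ["quantity", "unitprice"],
});
console.log(order.join(", ")); // "subtotal, tax, total"
```

Each field appears after everything it reads, so evaluating the fields in the returned order gives every expression up-to-date inputs.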

Overriding Calculated Fields

The readonly attribute may be used to mark a field to be read only. This prevents the field from being updated by user input, although it can still be updated via the DOM. For calculated fields that are not set to read-only, the user may override the calculated value. The expression defined by the calculate attribute doesn't apply to overridden fields until either the form is reset, or it is re-initialized by being reloaded. Expressions referring to overridden fields must use the current value of the field as entered by the user in place of the value calculated using the expression defined by the calculate attribute.

Required Field Values

The required attribute is used to provide an expression which is dynamically evaluated to determine whether the field must be filled out prior to submission.

Applies to input, select and textarea elements.

Read-only Fields

The readonly attribute is extended to allow its use with an expression that evaluates to a boolean value. The term "readonly" is reserved and bound to a value of "true" for backwards compatibility with HTML4.

Applies to input, select and textarea elements.

Relevant Fields or Groups of Fields

The relevant attribute is used to provide an expression which is dynamically evaluated to determine whether a field or group of fields is currently relevant. If the expression evaluates to false, the field or fieldset is deemed to be irrelevant and may be hidden from the presentation via a suitable style sheet.

Applies to input, select and textarea elements.

Field Sets

The HTML4 fieldset element allows authors to group thematically related controls and labels. Grouping controls makes it easier for users to understand their purpose while simultaneously facilitating tabbing navigation for visual user agents and speech navigation for speech-oriented user agents. The proper use of this element makes documents more accessible. The rendering can be controlled via style sheets.

This specification adds several new attributes to the fieldset element:

Field Set Name

The name attribute may be used to reference a fieldset by name from within an expression. This enables hierarchical naming of fields via the fieldset elements that contain them. For example, shipto.city references a field named "city" within a fieldset named "shipto", as in the following example:

<label for="f10">Will be shipped to</label>
<input id="f10" name="shipCity" readonly="readonly"
 calculate="difship ? shipto.city : billto.city" />

where a field displays the value of city taken from either the billing fieldset or the shipping fieldset depending on the value of a checkbox. This is supported by an extension to the HTML DOM, see the section on Field Collections.

This could in principle be used to support submission of form data as an XML document where the fieldset and field names are mapped into XML elements, and field values into XML text nodes.

Relevancy

The relevant attribute may be used with an expression that dynamically determines whether the fieldset and its contents are relevant. This can be combined with a suitable style sheet to hide irrelevant groups of fields.

Repeating Fieldsets

The repeat-number attribute is used together with a positive integer to indicate when the enclosed fields can be repeated. The rendering is device dependent, and on a large display may be rendered as a table with the field labels across the top, whilst on a small display as a vertical arrangement of labels and fields for a single "row" in the data set, with user-agent defined means to navigate to other rows. The value of the repeat-number attribute is a hint to the user agent for the number of "rows" to be shown, and must be an integer greater than or equal to 1 if the fields are to be repeatable.

The value of the for attributes of field labels should be chosen to match against field name attributes to avoid the labels being repeated for each row. The ordering of the columns is determined by the document order of the fields within the fieldset. The ordering of the labels within the fieldset content is immaterial.

The HTML4 object model allows for multiple fields with the same name for each form. This can be used to provide initial values for repeated data sets. You just need to include the corresponding field elements. Any missing fields will be initialized to their default values. For example:

<fieldset name="lineItem" repeat-number="4">
<legend>Repeating groups of fields</legend>

<!-- field labels -->
<label for="item">Product Name</label>
<label for="quantity">Number Purchased</label>
<label for="unitprice">Price Per Unit</label>

<!-- first row -->
<input name="item" type="text" title="product name" value="a"/>
<input name="quantity" type="number" title="number purchased" value="10"/>
<input name="unitprice" type="number" title="price per unit" value="1"/>

<!-- second row -->
<input name="item" type="text" title="product name" value="b"/>
<input name="quantity" type="number" title="number purchased" value="2"/>
<input name="unitprice" type="number" title="price per unit" value="3"/>

<!-- any remaining rows will be generated dynamically so that there
     is always at least the number of rows given by repeat-number -->
</fieldset>

The repeat-index attribute can be used to identify which row the input, select or textarea element belongs to. The attribute value must be an integer in the range from 1 to the value provided with the repeat-number attribute. Note that the repeat-index attribute is always needed when initializing radio buttons other than on the first row.

The user agent must ensure that radio buttons on repeated rows act independently for each row. In other words, selecting a button on one row shouldn't deselect a button on another row.

The user interface should provide the means to insert, delete and reorder rows, e.g. through controls placed at the end of each row. A new row may be automatically added when the last row is filled out. The object model provides methods for customizing the user interface. The minimum and maximum number of rows can be defined with the corresponding repeat-min and repeat-max attributes.

Object Model

The HTML4 forms object model is defined by the DOM Level 1 Recommendation. This specification provides incremental extensions to the HTML4 forms object model.

Field States

The CSS classes for fields (and fieldsets) are dynamically updated to reflect the field's current state. The classes used for this purpose include: focus, invalid, irrelevant, missing, readonly, disabled and overridden. Regular classes are used to allow implementation on existing browsers through a compatibility library written as a Web page script.

The CSS3 Basic UI Module defines a range of properties, pseudo-classes and pseudo-elements that can be used to style user interfaces, but requires native support.

The field's overridden property is a boolean value that when true signifies that the user has entered a value that overrides the calculated value.

Further work is needed to document how other field states are exposed via the DOM, including an indication of the reasons why a field has been found to be invalid, for use in customizing error reports.

Field Collections

Named fieldsets are included in the form's collection of named fields. If there are multiple fields with the same name, then the name resolves to an array of these fields.

Each fieldset defines a collection of the fields and fieldsets within its content (unless contained by a nested named fieldset). If there are multiple fields with the same name, then the name resolves to an array of these fields. This enables path based references to fields from within expressions.
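The naming rules for collections can be sketched with plain objects standing in for DOM nodes. The node shape ({ kind, name, children }) and the function names are assumptions made for this example; in particular, treating unnamed fieldsets as transparent is one reading of the rule above.

```javascript
// Add an item under a name; a second item with the same name turns the
// entry into an array.
function addNamed(collection, name, item) {
  if (!Object.prototype.hasOwnProperty.call(collection, name)) {
    collection[name] = item;
  } else if (Array.isArray(collection[name])) {
    collection[name].push(item);
  } else {
    collection[name] = [collection[name], item];
  }
}

// Hypothetical helper: build the collection for a form or fieldset.
function buildCollection(children, collection = {}) {
  for (const child of children) {
    if (child.kind !== "fieldset") {
      addNamed(collection, child.name, child);
    } else if (child.name) {
      // A named fieldset gets its own nested collection.
      addNamed(collection, child.name, buildCollection(child.children));
    } else {
      // Assumption: unnamed fieldsets do not introduce a new scope.
      buildCollection(child.children, collection);
    }
  }
  return collection;
}

const form = buildCollection([
  { kind: "field", name: "difship", checked: false },
  { kind: "fieldset", name: "shipto", children: [{ kind: "field", name: "city", value: "Paris" }] },
  { kind: "fieldset", name: "billto", children: [{ kind: "field", name: "city", value: "Oslo" }] },
  { kind: "fieldset", name: "lineItem", children: [
    { kind: "field", name: "item", value: "a" },
    { kind: "field", name: "item", value: "b" },
  ] },
]);

console.log(form.shipto.city.value);    // "Paris"
console.log(form.lineItem.item.length); // 2
```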

Field Labels

The labels property for each field is an array of its labels as bound through the label's for attribute.

The title attribute is copied between fields and their labels to enable the user interface to provide a tool tip when the pointing device "hovers" over the field. The copy operation avoids the need for the Web page author to duplicate the title attribute on both field and label. On Internet Explorer, the DOM1 htmlFor property has to be used as getAttribute("for") doesn't work.

Short cuts

Input fields should cancel bubbling of keystroke events to avoid activating application defined short cuts when the user is typing into an input field.

Updates

Required fields are checked when the form is about to be submitted. Calculated fields, validity and relevancy must be updated upon changes to field values (change event) and may be evaluated on a per keystroke basis, e.g. when typing a text string subject to a regular expression constraint.

Custom event handlers set with the HTML onchange and onsubmit attributes must be executed before any expressions defined with the constraint, relevant, required and readonly attributes.

When multiple event handlers are set via the DOM for the same event on the same element, the execution order of these handlers is undefined. [See definition of IE's attachEvent and the W3C DOM addEventListener]

Expressions

There are a few predefined functions:

count(fieldset): counts ticked checkboxes within that fieldset
sumover(fieldname, expression): sum the expression over a named column in a repeated dataset
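The behavior of these two functions is not spelled out further in this draft, so the following is only one possible reading. Checkboxes and repeated columns are modelled as arrays of plain objects, and the row expression is passed as a callback that receives the row index; both choices are assumptions of the sketch.

```javascript
// Sketch of count(): the number of ticked checkboxes in a fieldset,
// modelled here as an array of { type, checked } descriptors.
function count(fieldset) {
  return fieldset.filter((f) => f.type === "checkbox" && f.checked).length;
}

// Sketch of sumover(): evaluate rowExpression once per row of the named
// column and add up the results.
function sumover(column, rowExpression) {
  let total = 0;
  for (let row = 0; row < column.length; row++) total += rowExpression(row);
  return total;
}

// Two rows of a repeated line item fieldset.
const quantity = [{ value: "10" }, { value: "2" }];
const unitprice = [{ value: "1" }, { value: "3" }];

const orderTotal = sumover(quantity, (row) =>
  Number(quantity[row].value) * Number(unitprice[row].value));
console.log(orderTotal); // 16
```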

Author defined functions may be used, e.g. for the price of named products. An authoring framework may permit such functions to be generated automatically from a declarative data binding.

Field names can be simple, e.g. "total", or compound, e.g. "shipto.city". For compound names, all but the last part must be the name of a fieldset element.

There are some subtleties to setting up dependencies, which need documenting.

Empty fields are handled specially, e.g. for validation and calculations. This avoids empty fields being marked as invalid, and ensures that calculated fields are empty if one of their dependee fields is empty.

Expressions may use any of the ECMAScript operators and string methods. The result of evaluating an expression is coerced to the appropriate type, for instance, boolean for required and relevant attributes, and to the type of the field that owns them for calculated fields, for example, to numbers for numeric fields.

The use of ECMAScript expressions for field names constrains those names to be valid ECMA 262 identifiers. This precludes, for instance, names containing hyphens and periods.
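An authoring tool could check names with a test along these lines. The sketch is restricted to ASCII letters, digits, underscore and dollar sign, which is narrower than the full ECMA 262 identifier syntax, and it does not exclude reserved words; the function name is hypothetical.

```javascript
// Hypothetical helper: true if the name can be used directly as an
// identifier in an expression. ASCII-only simplification.
function isUsableFieldName(name) {
  return /^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name);
}

console.log(isUsableFieldName("unitprice"));  // true
console.log(isUsableFieldName("unit-price")); // false
console.log(isUsableFieldName("ship.city"));  // false
```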

Grammar for expression syntax

This is a BNF grammar for the subset of ECMA 262 r-value expressions that can be used with XForms Transitional.

expr ::=
  (expr)
  prefixop expr
  expr infixop expr
  expr ? expr : expr
  string
  number
  true
  false
  [form.][fset.]*field
  function([expr[,expr]*])
  array\[expr[,expr]*\]

prefixop ::=
  +
  -
  !

infixop ::=
  +
  -
  *
  /
  %
  &&
  ||
  <
  <=
  ==
  !=
  >
  >=

form ::=
  This must match the value of a name attribute
  for a form element

fset ::=
  This must match the value of a name attribute
  for a fieldset element

field ::=
  This must match the value of a name attribute
  for an input, select or textarea element

function ::=
  Literal name of a function defined in global scope

array ::=
  Literal name of an array defined in global scope

number ::=
  An ECMA 262 numeric literal, e.g. 42 and 1.4142

string ::=
  A quoted ECMA 262 string literal e.g. "foo" and 'bar'

true ::=
  ECMA 262 term for boolean true value

false ::=
  ECMA 262 term for boolean false value

Note that functions must be side-effect free and avoid introducing dependencies on form fields other than those passed via the function's arguments. This is needed to ensure that the expression can be statically analysed.
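A compact recursive descent checker for this grammar could be structured as in the sketch below. It only decides whether an expression is syntactically acceptable and does not build a tree or apply operator precedence. It is not the experimental library's code: the names tokenize and checkExpression are hypothetical, numeric literals are limited to plain decimals, and no check is made that names actually refer to existing forms, fieldsets, fields, functions or arrays.

```javascript
const PREFIX = new Set(["+", "-", "!"]);
const INFIX = new Set(["+", "-", "*", "/", "%", "&&", "||",
                       "<", "<=", "==", "!=", ">", ">="]);
// Alternatives: number | string | dotted name | operator or punctuation.
// "++" and "--" are tokenized so that they can be rejected.
const TOKEN = /\s*(?:(\d+(?:\.\d+)?)|("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')|([A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*)|(\+\+|--|&&|\|\||<=|>=|==|!=|[-+*\/%<>!?:(),\[\]]))/;

function tokenize(source) {
  const text = source.trim();
  const re = new RegExp(TOKEN.source, "y"); // sticky: match at lastIndex only
  const tokens = [];
  let pos = 0;
  while (pos < text.length) {
    re.lastIndex = pos;
    const m = re.exec(text);
    if (!m) throw new SyntaxError("Unexpected input at position " + pos);
    const type = m[1] !== undefined ? "number"
      : m[2] !== undefined ? "string"
      : m[3] !== undefined ? "name" : "punct";
    tokens.push({ type, text: m[0].trim() });
    pos = re.lastIndex;
  }
  return tokens;
}

// Returns true for a conforming expression, throws SyntaxError otherwise.
function checkExpression(source) {
  const tokens = tokenize(source);
  let i = 0;
  const isPunct = (text) =>
    i < tokens.length && tokens[i].type === "punct" && tokens[i].text === text;
  function expect(text) {
    if (!isPunct(text)) throw new SyntaxError("Expected " + text);
    i++;
  }
  function list(close) { // expr[,expr]* followed by the closing token
    expr();
    while (isPunct(",")) { i++; expr(); }
    expect(close);
  }
  function primary() {
    const t = tokens[i++];
    if (!t) throw new SyntaxError("Unexpected end of expression");
    if (t.type === "number" || t.type === "string") return;
    if (t.type === "name") { // true, false, field path, call or array index
      if (isPunct("(")) { i++; if (isPunct(")")) i++; else list(")"); }
      else if (isPunct("[")) { i++; list("]"); }
      return;
    }
    if (t.text === "(") { expr(); expect(")"); return; }
    if (PREFIX.has(t.text)) { primary(); return; }
    throw new SyntaxError("Unexpected token " + t.text);
  }
  function expr() {
    primary();
    while (i < tokens.length && tokens[i].type === "punct" && INFIX.has(tokens[i].text)) {
      i++;
      primary();
    }
    if (isPunct("?")) { i++; expr(); expect(":"); expr(); }
  }
  expr();
  if (i < tokens.length) throw new SyntaxError("Unexpected token " + tokens[i].text);
  return true;
}

console.log(checkExpression("difship ? shipto.city : billto.city")); // true
```

Because assignment and increment operators never appear as accepted tokens, expressions such as total = 5 or a++ are rejected before they could reach eval.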

Examples

Here is a list of demonstrators for different aspects of XForms Transitional using an experimental implementation of XForms Transitional as a cross-browser ECMAScript library:

You can view the same examples using Web Forms 2.0 which has a more procedural approach to representing form logic. This makes it difficult to automatically generate server side validation scripts from Web Forms 2.0 pages. By contrast a purely declarative approach lends itself to machine reasoning, and makes it easier to create high level authoring tools.

The library and style sheet are available free of charge under W3C software licensing policy. The size of the library can be reduced by compressing it with gzip. The web browser is able to automatically decompress the file. Running Douglas Crockford's jsmin on the script before compressing it further reduces the library download size to around 6 KBytes (minified version).

Work is underway to reflect changes as you type without waiting for the onchange event. Other ideas include customizable support for errors and warnings, adding and removing rows from repeating field sets, and the use of Ajax for partial updates to the form's data and Web page content. Your feedback would be much appreciated. My contact details are at the bottom of this page. Comments are also welcomed on the W3C public discussion list for forms.

Implementation Details

This is a collection of notes relating to an experimental implementation of XForms Transitional as a cross-browser ECMAScript library.

General Comments

Each field has properties corresponding to the pattern, calculate, constraint, required and relevant attributes. See below for the details of how expressions are prepared for dynamic execution.

The relevant property for each form is initialized to an array of all the fields in the form that each have a defined relevant property. If there are multiple fields with the same name, then the name resolves to an array of these fields. This is used to update relevancy after changes to field values.

For each field, the dependencies and dependees properties are initialized to arrays of the corresponding field dependencies due to field calculations. If a field B is calculated based upon the value of field A, then B is dependent on A, while A is a dependee of B. This is used to support a topological sort of field dependencies to determine update order when a field's value is changed or when the form is loaded or reset. Note that dependencies do not take into account author-defined functions called from expressions. Such functions should be side-effect free and avoid accessing fields except for those passed as parameters to the function.
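The bookkeeping described here can be illustrated with plain objects. In the sketch, dependees[B] lists the fields that B reads, and dependents[A] lists the fields to recalculate when A changes. These property names, the fields input shape, and the regular expression used to find references are assumptions of the example, not the library's actual code.

```javascript
// Find which known field names an expression refers to. String literals
// are blanked out first so that quoted text is never mistaken for a name.
function fieldReferences(expression, fieldNames) {
  const stripped = expression.replace(/"(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*'/g, '""');
  const refs = new Set();
  for (const m of stripped.matchAll(/[A-Za-z_$][\w$]*(?:\.[A-Za-z_$][\w$]*)*/g)) {
    // Function names, true and false are skipped because they are not fields.
    if (fieldNames.has(m[0])) refs.add(m[0]);
  }
  return [...refs];
}

// fields maps each field name to { calculate } (calculate is optional).
function buildDependencyGraph(fields) {
  const names = new Set(Object.keys(fields));
  const dependees = {};  // field -> fields it reads
  const dependents = {}; // field -> fields that read it
  for (const name of names) { dependees[name] = []; dependents[name] = []; }
  for (const name of names) {
    const expression = fields[name].calculate;
    if (!expression) continue;
    for (const ref of fieldReferences(expression, names)) {
      dependees[name].push(ref);
      dependents[ref].push(name);
    }
  }
  return { dependees, dependents };
}

const graph = buildDependencyGraph({
  quantity: {},
  unitprice: {},
  subtotal: { calculate: "quantity * unitprice" },
  total: { calculate: "subtotal + shipping('quantity')" },
});
console.log(graph.dependees.total);     // [ 'subtotal' ]
console.log(graph.dependents.quantity); // [ 'subtotal' ]
```

Note that the call to the author-defined function shipping contributes no dependency, which mirrors the limitation stated above.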

Most browsers generate the onchange event when you activate radio buttons and checkboxes. An exception is Internet Explorer which only generates the onchange event when the focus is removed from the field. A workaround is to add an onkeydown event handler to simulate the onchange event.

The focus/blur events are used to dynamically set/reset the focus class to allow the style sheet to highlight the field with the input focus. This is not done for radio buttons and checkboxes to avoid rendering problems with button backgrounds on some browsers.

Preparing Expressions

Expressions are copied from the corresponding attributes and prepared for execution via the ECMAScript eval function. The attribute values are not modified in this process. The analysis uses regular expressions to recognize references to fields, taking care of predefined functions etc. Each field reference is rewritten according to its type. The following uses 'name' as a macro for the field's actual name:

Radio buttons	radioValue(form.name)
Numeric fields	eval(form.name.value)
Date fields	(new Date(form.name.value))
Checkboxes	(form.name.checked == true)
Fieldsets	form.name
Others	(form.name.value)

A special case is where the reference is to an array, which occurs for repeated fields with the same name. In this case, the substring "[$]" is appended to the name.

When expressions are dynamically evaluated, form is a local variable for the form for the field that owns the expression. Likewise $ is a local variable that gives the array index for the owning field when the field is one of several with the same name, as is the case for repeated fieldsets.
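The rewriting step might be sketched as follows for simple field names. This is not the library's code: it uses Number() where the table above shows eval() for numeric fields, builds a function with the Function constructor instead of calling eval directly, and leaves out compound names and the "[$]" indexing for repeated fields. The helper names are hypothetical.

```javascript
// Value of the checked button in a group of radio buttons, else "".
function radioValue(buttons) {
  const checked = buttons.find((button) => button.checked);
  return checked ? checked.value : "";
}

// Rewrite one field reference according to the field's type.
function rewriteReference(name, type) {
  const ref = "form." + name;
  switch (type) {
    case "radio":    return "radioValue(" + ref + ")";
    case "number":   return "Number(" + ref + ".value)"; // swapped in for eval()
    case "date":     return "(new Date(" + ref + ".value))";
    case "checkbox": return "(" + ref + ".checked == true)";
    case "fieldset": return ref;
    default:         return "(" + ref + ".value)";
  }
}

// fieldTypes maps simple field names to their type.
function prepareExpression(source, fieldTypes) {
  // Split on string literals (odd-numbered parts) so that quoted text is
  // left untouched; only the code in between is rewritten.
  const parts = source.split(/("(?:[^"\\]|\\.)*"|'(?:[^'\\]|\\.)*')/);
  const rewritten = parts.map((part, index) =>
    index % 2 === 1
      ? part
      // The lookbehind skips names that follow a "." (property access).
      : part.replace(/(?<![\w$.])[A-Za-z_$][\w$]*/g, (name) =>
          Object.prototype.hasOwnProperty.call(fieldTypes, name)
            ? rewriteReference(name, fieldTypes[name])
            : name)
  ).join("");
  return new Function("form", "radioValue", "return " + rewritten + ";");
}

const form = {
  quantity: { value: "10" },
  unitprice: { value: "1.5" },
  gift: { checked: true },
};
const types = { quantity: "number", unitprice: "number", gift: "checkbox" };
const price = prepareExpression("quantity * unitprice + (gift ? 2 : 0)", types);
console.log(price(form, radioValue)); // 17
```

The prepared function is built once and can then be called each time a dependee changes, with form bound to the owning form.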

Pending implementation work

The following features are awaiting implementation:

Recursive descent parser for expressions that raises an exception for expressions that don't conform to the defined grammar. I currently use a lighter weight approach that is less rigorous in its error checking.
More flexible parsing of date types, rather than limiting users to the syntax understood by the ECMA 262 Date object.
Support for the time data type. This should be fairly simple to add with a parser based upon regular expressions.
Support for using the step attribute on number, date and time fields, and exploration of an alternative mechanism that only affects the presentation.
Expressions for whether a field is read-only. Current browsers are likely to just look for the presence of the readonly attribute, so attribute renaming is likely to be needed, and the Opera attribute bug may also be a complication.
Support for marking calculated fields as having been overridden by user input. This should be easy to add. Note that overriding a calculation should dynamically remove the dependencies on other fields incurred by that calculation.
Support for using radio buttons as part of repeating fieldsets. The easy fix would be to translate these to select elements. The harder approach would be to create a pseudo select element and to rename the radio buttons to avoid interaction across rows. These would be disabled just prior to submission to avoid their inclusion in the submitted data.
Support for the repeat-index (row) attribute for more flexible initialization of repeating fieldsets.
Support for adding/deleting/re-ordering rows in a repeated fieldset.
Support for price attribute on option elements. I already have a rough solution working for input elements.
Support for relevance attribute on other elements for use in dynamic summaries.
Support for wizards via state attribute, type="next" and type="previous".
Support for richer error reports via a message element.

Some of these features are described in the next section and are subject to change according to the experience gained during their implementation.

Further Considerations

Separating presentation and data for form controls

A major failing of HTML4 was the lack of separation between the values presented by form field controls and the internal values exposed through the DOM and submitted to the server. This provides a challenge when it comes to offering a compatibility library for existing browsers. That may sound like a small detail, but consider the point of view of a website developer when faced with features that are supported on perhaps 10% of browsers compared with features that are supported on 99% of browsers. In particular, this affects date and number controls.

Spreadsheets allow you to limit the number of digits shown after the decimal point without constraining the accuracy with which the number is held internally and used for dependent calculated fields. This is also related to the localization of how numbers are presented, e.g. thousands separator, and comma vs period for the decimal point. How to specify these two things? As attributes and properties, but what formats? How to support this on existing browsers? For dates, it is desirable for users to be able to enter and view dates in a format that suits their expectations. For servers, it is desirable to have dates submitted in as few formats as possible. The obvious choice is ISO 8601. Experimental work is underway to show the feasibility of emulating the separation between presentation and submitted data on existing browsers for both number and date field types.
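For numbers, the separation can be illustrated with the standard toLocaleString method, which formats a value for display without altering it. The function name presentNumber and its parameters are hypothetical; how such settings would be expressed as attributes is exactly the open question raised above.

```javascript
// Hypothetical helper: format a number for display with a fixed number
// of fraction digits and locale dependent separators.
function presentNumber(value, fractionDigits, locale) {
  return value.toLocaleString(locale, {
    minimumFractionDigits: fractionDigits,
    maximumFractionDigits: fractionDigits,
  });
}

const stored = 1234.5678; // value held internally and used in calculations
console.log(presentNumber(stored, 2, "en-US")); // "1,234.57"
console.log(stored); // 1234.5678, unchanged by the formatting
```

Dependent calculated fields would read the stored value, so rounding for display never accumulates into the results.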

Greater flexibility for binding fields to forms

One of the drawbacks of HTML4 is that fields must be included within a form element in order to be part of that form. This can cause problems when combined with the use of tables for layout purposes. However, it is generally better to use CSS for layout which avoids this problem. It has been proposed that fields should be able to list the names of the forms to which they belong. There are two difficulties with that. Firstly, it introduces ambiguity as to which form a field named from within an expression belongs, and secondly, experiments reveal that some common browsers (e.g. Firefox) won't submit fields added to a form by a script. This gets in the way of using a script to interpret a form attribute that names the forms the field belongs to, and thereby violates one of the core assumptions that this specification is based upon.

Customization of error reports and other features

Further work is needed to allow for flexible customization of error reports, e.g. as summaries or inline with the fields. This could be based upon a mix of markup and scripting. At its simplest, this is just a matter of providing events that can be used to attach error handlers, e.g. when a field is invalid, and when required fields are yet to be filled out upon an attempt to submit the form. The latter is catered for by HTML4's submit event, but a new invalid event would be valuable.

The title attribute can already be used to provide a tooltip for the field and label. Greater flexibility would be achievable by introducing a new element whose content is the error message. The element could be bound to the field in the same way as for labels. The use of an element also makes it possible to use CSS to style and position it as needed. It would be hidden until it needed to be shown. This shouldn't require any scripting. For that reason, it would be desirable to be able to give different messages for different kinds of errors. This suggests the use of an attribute to indicate which error the message is for. A further attribute could determine whether the message is shown when the field is found to be invalid, or only when the user tries to submit the form.
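As a rough sketch of how a script library might decide which message elements to reveal, assume each message carries two pieces of information: the kind of error it is for, and whether it is shown as soon as the field is invalid or only on submission. The property names error and show, and their values, are hypothetical since this draft does not yet name the attributes.

```javascript
// Hypothetical model of message elements bound to one field:
//   { error: "required", show: "submit",  text: "..." }
//   { error: "pattern",  show: "invalid", text: "..." }
// errors is the list of error kinds currently detected for the field;
// submitting is true while the user is attempting to submit the form.
function visibleMessages(messages, errors, submitting) {
  return messages
    .filter((m) => errors.includes(m.error))
    .filter((m) => m.show === "invalid" || submitting)
    .map((m) => m.text);
}
```

Messages that match no current error stay hidden, which corresponds to the element remaining hidden until it is needed.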

Associating prices with choices

For web pages used for ordering something online it is common to present a set of choices where each choice is associated with a particular price. One solution is to define a function that maps the choice to the price, but this involves the need to write some script. It would be nice to define a declarative mapping instead. The simplest idea is to add an attribute "price" and for there to be a predefined function that gets its value, for example:

<input name="memory" type="radio" value="512Mb" price="56.29">
<input name="memory" type="radio" value="1Gb" price="94.16">

...

<output calculate="price(base)+price(memory)+price(drive)+price(warranty)">
</output>

The name "price" might be a little too specific to this one use case and a more general name might be better, but it is not yet clear what that name should be. The pizza demo involves a base price depending on the pizza size and a per topping price that also depends on the pizza size. The radio buttons for the pizza size could perhaps have two attributes, one for the base price and the other for the topping, but how obvious would this be to the web application author? The demo was taken from a real website and the price happened to be independent of the crust thickness, but in principle, it could have depended on this. It is an open question as to whether it is practical to represent such mappings without involving the author in defining functions in a scripting language. The common case is where the price for an option is independent of other choices, so it is probably worth preserving a simple solution for that purpose.
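For the common case where each option's price is independent of other choices, the predefined function might behave along the following lines. This is a sketch over a plain data model rather than the DOM; the shape of the form object and the helper total are assumptions made for illustration.

```javascript
// Hypothetical data model: each named field lists its options (with the
// value and price attributes from the markup) and the selected value.
function price(form, name) {
  const field = form[name];
  if (!field) return 0;
  const chosen = field.options.find((o) => o.value === field.selected);
  // An unselected field contributes nothing to the total.
  return chosen ? Number(chosen.price) : 0;
}

// Equivalent of calculate="price(base)+price(memory)+..." for a list of names.
function total(form, names) {
  return names.reduce((sum, name) => sum + price(form, name), 0);
}

const order = {
  memory: {
    selected: "1Gb",
    options: [
      { value: "512Mb", price: "56.29" },
      { value: "1Gb", price: "94.16" },
    ],
  },
};
```

With this model, price(order, "memory") returns 94.16. A real implementation would also need to decide how to round and present the sum.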

Note that for my implementation it proves easier to support expressions with the syntax $name rather than price(name). This is probably too idiosyncratic a notation to be widely accepted, but I could be wrong!

Wizards and summaries of choices

For a web page acting as an order form for a personal computer system, the form might start by offering you a choice of base systems, e.g. the processor and motherboard. Subsequent steps would take you through the choice of how much memory, the graphics accelerator, size of hard drive, optical drive, display size and so on. At each step, the choices presented may depend on what you selected earlier on, for instance, the kinds and amount of memory available may depend on the base configuration. The form also provides you with a summary of the choices you have made, perhaps as a running total for the price and a bullet list of features. This should all be possible in a single web page without the need for scripting.

Wizards can be modelled as a sequence of states, where for each state the relevant parts of the form are revealed. This can be simply represented using a hidden field whose value identifies the current state. Other fields can then use the relevant attribute to hide and reveal them as appropriate, taking the state and the values of other fields into account. We now need a way to express the rules for moving to the next state. There are several common cases:

the initial state when the form is first loaded
the user has filled out all choices in a group
the user has pressed the "next" or "previous" button
the form has been submitted and awaiting the server's response
the form has been reset

For buttons, we could define an attribute indicating the next and previous states, e.g.

<button type="previous" state="video">Prev</button>
<button type="next" state="drive">Next</button>

where a button labeled "Next" takes you to the next part of the order form which deals with selecting the kinds of drives you want with your computer. The "state" attribute could be used with a literal string, but it would be more flexible to use an expression as this would allow the next state to depend on what choices the user had made. This would be at the cost of a slight complication for the simple case due to the need to quote the string literal, e.g. "'drive'". The action associated with type="next" would only be taken if all of the required fields that are currently relevant have been filled out. This restriction doesn't apply to type="previous". Additional constraints can be associated with actions by adding a constraint attribute.
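The rule for the "next" and "previous" actions can be sketched as a small state transition function. The sketch treats the state attribute as a literal string rather than an expression, and the form and field objects are a hypothetical model, not an API defined here.

```javascript
// Required fields that are currently relevant but still empty.
function missingRequired(fields) {
  return fields.filter((f) => f.relevant && f.required && f.value === "");
}

// Returns the new value for the hidden state field after a button press.
// type="next" is refused while relevant required fields are unfilled;
// type="previous" is always allowed.
function nextState(form, button) {
  if (button.type === "next" && missingRequired(form.fields).length > 0) {
    return form.state; // stay on the current step
  }
  return button.state;
}
```

For a form in state "memory" whose required memory field is empty, a button with type="next" and state="drive" leaves the state unchanged, whereas a button with type="previous" and state="video" moves to "video". An additional constraint attribute would add a further condition to the first test.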

When ordering a product with lots of options it is nice to be able to see a summary of what you have chosen. This is especially important for situations where the form is too large to view all sections at the same time. This raises two points. The first is how to make it easier to create forms that present each section of the form, one by one, and second, how to generate a summary of the choices made so far.

The summary could just be a list of bullet points or paragraphs that are shown when they are relevant. The simplest way to arrange that is to use an attribute to bind the element to a checkbox, radio button or option in a select element, e.g. through an idref value. A more flexible approach would be to reuse the existing relevant attribute. One detail with that is how to identify the form. This is either the enclosing form element, or the expression could give the form name as in the following example:

<li relevant="order.memory == '1Gb'">Memory - 1 GB - SO DIMM
200-pin - DDR II - 400 MHz / PC2-3200</li>

where "order" matches the name attribute on the form element and is only needed when the list item isn't enclosed by that form element. Note that the relevant attribute can be used on li, tr, p and div elements in addition to form controls.
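To show how the form name in such an expression might be resolved, here is a deliberately narrow sketch. It handles only a single equality test against a quoted string, which is far less than a real expression evaluator would need, and the names isRelevant, forms and enclosingForm are hypothetical.

```javascript
// forms maps each form name to an object of field values.
// enclosingForm is the name of the form element containing the element,
// used when the expression does not name a form itself.
function isRelevant(expr, forms, enclosingForm) {
  const m = /^\s*(?:(\w+)\.)?(\w+)\s*==\s*'([^']*)'\s*$/.exec(expr);
  if (!m) throw new Error("unsupported expression: " + expr);
  const formName = m[1] || enclosingForm;
  const fieldName = m[2];
  const literal = m[3];
  const form = forms[formName];
  // An unknown form or field makes the element irrelevant.
  return form !== undefined && form[fieldName] === literal;
}
```

Given forms = { order: { memory: "1Gb" } }, the expression "order.memory == '1Gb'" is relevant from anywhere in the page, while "memory == '1Gb'" relies on the enclosing form being "order".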

Server-side library for no scripting case

If scripting is turned off, the scripting library won't be able to apply the validation constraints or to refresh calculated fields. A workaround is to provide a submit button to allow these functions to be carried out by the server. To ensure that this button is only visible on browsers with scripting turned off, you should enclose the button in a noscript element, e.g.

<noscript>
  <button name="refresh" value="now">Refresh Form</button>
  <input type="hidden" name="noscript" value="true" />
</noscript>

The hidden field allows the server to determine that scripting has been disabled, and hence change the behavior it applies to submitted forms. An open source scripting library module could be provided to apply the semantics of the XForms-Transitional features used on the page that submitted the form.

This would include inserting additional buttons as needed to invoke date controls or to toggle combobox controls between the list and text input modes. This avoids the need for extensive use of noscript in the original markup.
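On the server, the first step might be to classify each submission using the two fields from the noscript example. The following sketch assumes the submitted fields have already been decoded into a plain object; the function name and the returned shape are illustrative only.

```javascript
// fields: decoded name/value pairs from the submitted form.
// The "noscript" and "refresh" names match the markup example above.
function classifySubmission(fields) {
  const noscript = fields.noscript === "true";
  // The Refresh Form button only exists inside noscript, so treat it as a
  // request to revalidate and recalculate rather than a final submission.
  const refresh = noscript && fields.refresh === "now";
  return { noscript: noscript, action: refresh ? "refresh" : "submit" };
}
```

A "refresh" result would lead the server to apply the constraints and calculations itself and return the updated page, while "submit" with noscript set would still call for server-side validation before the data is accepted.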

Miscellaneous Ideas

Other ideas of interest include:

autocomplete hint
autofocus
inputmode
output element
tree controls
co-related quantities
forminput event
datalist element

Acknowledgements

This specification is a synthesis of ideas from HTML4, Web Forms 2.0 and XForms. It started as an experiment to study the feasibility of providing incremental extensions to HTML4 forms that minimize the need for scripting compared to Web Forms 2.0, and that provide the benefits of the XForms bind constraints as simple attributes on form fields rather than as separate elements, as is the case for XForms.

References

HTML4: "HTML 4.01 Specification", D. Raggett, A. Le Hors, I. Jacobs, 24 December 1999.
HTML DOM: "Document Object Model (DOM) Level 1 Specification", Lauren Wood et al., 1 October 1998.
ECMA-262 (ECMAScript): "ECMAScript Language Specification", Standard ECMA-262, December 1999.
ECMA-357 (E4X): "ECMAScript for XML (E4X) Specification", Standard ECMA-357, December 2005.
Web Forms 2.0: Latest draft from the WHATWG.
XForms: See http://www.w3.org/MarkUp/Forms.

Dave Raggett <dsr@w3.org>

Last updated $Date: 2008/01/17 16:24:50 $