Rules for extending a WWW client: The Symposia API

Jean Paoli, Technical Director, GRIF S.A., 2 Boulevard Vauban, BP 266, 78053 St Quentin en Yvelines, Cedex, France

Jean.Paoli@grif.fr

Abstract:
There is a great need for WWW clients to be extensible. The availability of the source code of some popular browsers (Mosaic) led many people to slice the original Mosaic or CERN code and to add diverse custom code for specific applications.

In our view, a WWW authoring/viewing environment must be extensible enough to allow the building of interactive document authoring environments in which the user is able to access all relevant documentary information on the Web and incorporate it directly in his/her document.

Symposia (shipping since March 95) is a joint INRIA / GRIF S.A. project for building a cooperative WYSIWYG authoring tool for the WWW. Symposia will soon be shipped with an API that we have developed that presents a set of solid principles for extending the user interface, document management, network extensibility and interactive behavior of document fragments in a WWW client.

We will discuss in this paper the advantages gained from basing the extensibility of a WWW client on a generic structured environment. We will present two of the approaches proposed today for extending WWW clients, Forms/CGI and Java, and will compare them with the Symposia API.

Keywords:
Extensibility, Authoring, Symposia, Structured Documents, SGML, API, Document Oriented user Interfaces.

Table of contents

1 Symposia

2 Extensibility criteria

3 Extensibility in Symposia

4 Some Applications

5 Related methods for extensibility

6 Conclusion

7 Acknowledgements

8 References

9 Biography: Jean Paoli

1 Symposia

Symposia is a WYSIWYG authoring tool that enables you to create and modify HTML and SGML documents directly on the WWW. Symposia was developed by GRIF S.A. and INRIA as part of the European effort for the creation of more powerful tools for the World Wide Web.

Symposia is built on top of the Grif WYSIWYG SGML editor that allows files based on any DTD to be loaded, edited and saved in native SGML format. Functions for handling each of the different HTML versions have been incorporated into the Grif software and the SGML parser used has been modified so as to accept HTML documents which are not strictly valid.

Thanks to the extensibility of the Grif software, the CERN/W3C network library has been integrated with the SGML Editor, and OpenURL and SaveURL commands have been added, the latter using the PUT method of the HTTP protocol. This allows documents to be created and saved directly on remote servers. Various cooperative strategies have been studied to allow collaborative authoring on the WWW and a simple strategy (lock/unlock file) has been developed [Paoli 95].

Special user-friendly editing commands (point and click) for creating and modifying anchors have been written, and links can be followed through the network immediately after being created. The tool accepts each of the HTML 'dialects', and any other SGML DTD could be incorporated in the tool using the standard features of the Grif environment. This would allow the creation, editing and remote saving of documents by multiple authors on the network in their original SGML format.

Further work is already underway in the field of collaborative authoring on the WWW, including the incorporation of annotated documents, handling large documents, support for different versions and the interactive incorporation of various fragments of data from different servers into one document.

A freeware version of Symposia can be downloaded from the INRIA WWW server at http://symposia.inria.fr . A Pro version is commercialized by GRIF S.A.

2 Extensibility criteria

The design of user interfaces to access or modify data on the WWW raises the need for WWW clients to support new data type handling, configurability, programmability and open-endedness:

Extensibility is said to be dynamic if extensibility code can be loaded, installed and executed directly without re-building the software. It is said to be static if recompilation or rebinding of the software is needed to access the extended functions.

In WWW clients, extensibility could also more specifically address imaging or text processing, depending on the type of application needed.

We will examine in detail in section 3 the characteristics of the extensibility features of Symposia. We will give in section 4 a few examples of applications which extend the functionalities of Symposia on the WWW, and will compare in section 5 the extensibility methods of Symposia with the major ones proposed today for WWW clients. Finally, we will outline in section 6 the characteristics of the extensibility of Symposia with regard to the extensibility criteria presented here.

3 Extensibility in Symposia

3.1 Handling new tags

Handling new tags in a WWW client raises the need to consider how to extend the client to parse, format and edit new tags.

Symposia is built on top of the Grif WYSIWYG SGML editor which handles generic structured documents [Quint 95] .

Structured documents are documents which are internally organized according to their content. They follow Document Type Definitions which specify the logical elements which can be used in a document and how these elements may be arranged in a hierarchical way. New tags (represented syntactically by a start tag and an end tag) are manipulated as elements and can themselves contain other elements.

Incorporating new tags in Symposia involves adding the new tags to the HTML DTD supported by Symposia and compiling it using an external tool named Grif Application Builder. Grif Application Builder generates multiple parametrization files which enable Symposia to parse and edit the new tags.

In an approach which represents information and data as documents, we think that it is important to develop and make extensive use of SGML to formalize these documents [Sperberg 94] . Documentary data are more clearly identified by semantic tags [Paoli 94] . Semantic tags are used to precisely identify corporate or industry specific information such as motors, product parts, transistors, or other objects which require a very precise description. This is one of the strong points of SGML and there is always a need for mission-critical data to be formalized and stored using such markup.

Because Symposia makes use of Grif SGML Editor's system for handling generic structured documents, incorporating new DTDs in Symposia is simply a matter of following the process previously described.

3.2 User Interface

Because the document concept plays such an important role in the WWW, it is only logical that extensibility rules for a WWW client consider carefully the extensibility of the user interface, not only around a document (for tailoring menus or dialog boxes), but essentially within the document itself.

In fact, a lot of the interaction between the end user and WWW servers is done through documents. For example, in Mosaic or Netscape:

In the above examples, the integration of multiple tools is done in a seamless way for the end user: data is generated within a document and sent from tool to tool, from server to server. The document is used as the natural vehicle between users and computers [Bier 90] .

This kind of user interface is called Document Oriented user Interface (DOI): basic user operations on the document launch tools which operate distinctively on the selected portion of the document. We call this the document paradigm because complex applications could be disguised as document component behavior.

This user interface philosophy has been endorsed by major companies: Microsoft is increasingly committed to OLE, and Apple, IBM, Novell and others are creating and supporting OpenDoc. Although there are some differences between them, OLE and OpenDoc are both basically aimed at providing Document Oriented user Interfaces.

WWW clients must allow us to build, to develop, and make extensive use of this user interface approach.

A WWW client can be likened to an interactive document environment in which users have everything at hand to access relevant documentary information and to interact with other processes. The WWW client must constantly rearrange the data resulting from these interactions in such a way as to present the information intelligently to the user.

This environment is characterized by:

This environment reinforces the notion of content. Individual applications become relatively less important because many of them are used to build a single document.

An analysis of the key characteristics of such environments shows that:

Two different but mutually complementary approaches could be considered:

Symposia implements the second approach. Both approaches are, however, complementary.

3.3 Activity tracking

Implementing Activity Tracking

Extensibility of a WWW client means that multiple tools become available together, possibly within a single document. To use structured documents as user interfaces (by scripting them), the key issue is how to specify behavior (or 'activity tracking') in a document [Terry 90].

What is generally referred to as 'writing scripts' is composed of 3 subjects:

We will address here more specifically the first subject.

Events must be clearly separated between user interface events and semantic events. User interface events are events that pertain only to the state of the graphical user interface, not directly to its content. User interface events include mouse-clicks and keystrokes. For example, VisualBasic identifies Click, DblClick, DragOver, GetFocus, MouseDown, etc.

Semantic events pertain directly to the data content model. The list below contains just a few examples of semantic events:

The definition of these events should list each event together with its parameters: applications should receive each individual edit separately and have access, through the parameters, to the place and the context in which the modification occurred.

Activity Tracking in Symposia

In Symposia, our approach to activity tracking is an object-oriented approach for tailoring the behavior of SGML elements: SGML elements receive event messages reflecting the user interaction and in response they execute an appropriate action. Actions are written in C code (see section 3.4).

Basic user interaction generates messages but, most importantly, changes to the structure and content of the elements also generate messages. The list of supported messages contains almost all of the SGML ESIS events related to the creation and modification of SGML elements and attributes. Content modifications, such as changes to PCDATA text, also generate messages to the appropriate element.

Associating Messages to Structured Documents

The I (Interface) language was developed to allow the binding of messages to elements, to specify which actions should be called, and to define new messages and menu items.

The syntax for specifying the parametrization of the behavior of SGML elements is [Quint 94] [Paoli 94] :

ELEMENTNAME:
        Message1: Action1;
        Message2: Action2;

This indicates that the action Action1 has to be executed when the event message Message1 is sent to the element named ELEMENTNAME .
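
As a rough illustration of how such a binding could be resolved at run time, the following C sketch keeps a table of (element name, message, action) triples and looks up the action when a message arrives. The Element stand-in, the Binding type, the dispatch function and the sample rules are all hypothetical; they are not part of the Symposia API and only mirror the shape of the I syntax above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an element; not the real Symposia type. */
typedef struct {
    const char *type_name;
    int hits;                 /* counts how many actions ran on it */
} Element;

typedef void (*Action)(Element *element);

/* One rule of an I file: ELEMENTNAME: Message: Action; */
typedef struct {
    const char *element_name;
    const char *message;
    Action action;
} Binding;

/* Sample action: simply records that it was executed. */
static void count_hit(Element *element) { element->hits++; }

/* Sample rules, chosen for illustration only. */
static const Binding bindings[] = {
    { "ANCHOR", "StdFollowLink", count_hit },
    { "PARA",   "StdTextModify", count_hit },
};

/* Runs the action bound to (element type, message).
   Returns 1 if a rule matched, 0 otherwise. */
int dispatch(Element *element, const char *message)
{
    assert(element != NULL && message != NULL);
    for (size_t i = 0; i < sizeof bindings / sizeof bindings[0]; i++) {
        if (strcmp(bindings[i].element_name, element->type_name) == 0 &&
            strcmp(bindings[i].message, message) == 0) {
            bindings[i].action(element);
            return 1;
        }
    }
    return 0;
}
```

An element only reacts to the messages it has a rule for; any other message falls through, which in a real client would presumably mean running the default behavior.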

Messages can be associated to structured documents on an element basis.

Messages are bound to elements in an I file:

Each element records its interest in receiving a message in the RULES section.

An I file contains different sections:

APPLICATION
Defines which DTD the application is designed for, e.g. HTML2.0.
DEFAULT
Defines default actions bound to messages for all elements and attributes of the DTD.
RULES
Defines a set of messages and actions for each element.
ATTRIBUTES
Defines a set of messages and actions for each attribute.
MENUS
Defines all the menus and menu items presented in the menu bar, and a message for each item, on a DTD basis.

By using one or multiple I files, a specific extension could be activated only on a certain type of elements and on a specific type of user interaction (such as a double click on the element or the creation or the deletion of another element).

Standard Messages Available

This section gives a few examples of the standard messages which are provided in Symposia:

Messages on elements:

Messages are sent when a menu item is activated. Examples of this type of message include:

StdCut
Indicates that the Cut item of the Edit menu has been activated.
StdExit
Indicates that the Exit item of the File menu has been activated.

Messages are also sent following any user action other than activation of a menu item. Examples of this type of message include:

StdSelect
Indicates that an element in the document has just been selected.
StdTextModify
Indicates that the modification of a text element (by inserting or deleting characters) has just been ended.
StdFollowLink
Indicates that the user has double-clicked (or another standard interaction) and intends to follow a link starting from this element.
StdCreateNew
Indicates that the element has just been created by the user using the standard Insert dialog box or by a Carriage Return.

Messages on attributes:

Messages are sent when an attribute is set on or removed from an element using the standard Attribute dialog box. Examples of such messages include:

StdAttrCreate
Indicates that the attribute has just been set on the element using the Apply button in the Attribute dialog box.
StdAttrModify
Indicates that the value of the attribute has just been modified using the Apply button in the Attribute dialog box.
StdAttrDelete
Indicates that the attribute has just been removed from the element using the Remove button in the Attribute dialog box.

Replacing or Refining Standard Edit Functions

Each action attached to a message could replace, precede or be executed after the corresponding standard Symposia edit function.

Thus, standard messages could be used as follows:

Message.Pre :TheAction
The action precedes the Symposia standard edit function.
Message :TheAction
The action replaces the Symposia standard edit function.
Message.Post:TheAction
The action follows the Symposia standard edit function.

Using this syntax for a specific element, one could choose to override or to refine the standard Symposia editing command.
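
The three forms can be pictured as three optional hooks around the standard edit function. The C sketch below is only an illustration under our own assumptions: the Hooks structure, run_message and the recording helpers are hypothetical names, and we assume that a Pre or Post action may be combined with either the standard function or a replacement, which the syntax above suggests but does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element that records the order in which code ran. */
typedef struct {
    int log[4];
    int n;
} Element;

typedef void (*Action)(Element *element);

/* The three ways of attaching an action to one message. */
typedef struct {
    Action pre;      /* Message.Pre  : runs before the standard function */
    Action replace;  /* Message      : runs instead of it                */
    Action post;     /* Message.Post : runs after it                     */
} Hooks;

void record(Element *e, int code)
{
    if (e->n < 4)
        e->log[e->n++] = code;
}

/* Stand-in for a standard edit function of the client. */
void standard_edit(Element *e) { record(e, 0); }

/* Sample extension actions. */
void my_pre(Element *e)     { record(e, 1); }
void my_post(Element *e)    { record(e, 2); }
void my_replace(Element *e) { record(e, 3); }

/* Delivers one message to an element, honouring the hooks. */
void run_message(const Hooks *hooks, Element *e)
{
    assert(hooks != NULL && e != NULL);
    if (hooks->pre != NULL)
        hooks->pre(e);
    if (hooks->replace != NULL)
        hooks->replace(e);       /* override the standard function */
    else
        standard_edit(e);        /* refine: keep the standard function */
    if (hooks->post != NULL)
        hooks->post(e);
}
```

With only pre and post set, the standard function is refined; with replace set, it is overridden.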

Other extensibility features

It is possible, by using a set of API functions, to extend the user interface of Symposia by adding menus or dialog boxes. In this case, new messages are also defined for each new menu, and activity tracking is extended to these new elements of the user interface.

Other approaches to activity tracking

SGML

The SGML standard (ISO 8879) defines what is valid input to a parser and as such defines what an SGML parser must do. An SGML parser produces parsing events (ESIS), which are not sufficient to define activity tracking. This comes from the fact that parsing is commonly viewed as a batch operation, whereas what we are addressing here is interactivity: the way ESIS was defined is more compatible with the concept of batch parsing than with the interactive construction of SGML data.

HyTime

The base module of the HyTime standard [DeRose 94] contains a section (Base module, 6.5.7) on activity tracking policy (attribute list form all-act ). Six possible activities are described for an object : create, modify, link, access, unlink, delete. When considered together with the ESIS definition of low-level parser events which enable the parser to recognize markup constructs, we have here a framework which allows us to establish a good definition of activity tracking.

OpenDoc

In OpenDoc [OpenDoc 94] , scriptability is considered as a key issue and there is full support for the definition of semantic events, based on the content model of a Part. It would be interesting to investigate whether OSA could support the SGML content model.

3.4 Data tree handling

Symposia is built on top of the Grif WYSIWYG SGML editor which handles generic structured documents. Data are handled internally following the principles of structured editing and are represented as a set of in-memory tree constructs. These tree representations are updated incrementally when the user modifies the document.

Writing Actions Associated to Messages

As we said in section 3.3, to extend Symposia, actions are associated to messages in an I association file. These actions are executed in response to a user interaction and are implemented by the writer of the extension as C functions with the standard action function signature, as here:

void ExtensionAction(Element element)
{
    /* Place user code here.                             */
    /* The code can access the internal Symposia         */
    /* structures using the Symposia API.                */
    /* It can also call HTTP/network functions and       */
    /* retrieve external data.                           */
}

The writer of the extension can call the Symposia API which enables him/her to access the HTML/SGML fragment, to read its content, to modify it, or to insert other fragments.

Services provided by the Symposia API

The Symposia API provides a programming interface to the HTML/SGML structure and content. The API supports element and attribute creation and manipulation, content modification, structural searches, and incorporation of fragments into the document.

Tree handling

The API includes functions that make it possible to create a new element, to modify its content, to create an element from a HTML/SGML fragment, to move through the tree structure, to search through the tree structure, or to define an element in the document as read only:

Element GtNewElement(document, elementType);
Element GtSetTextContent(element, content);
Element GtOpenBuffer(document, dtdname, buffer);
Element GtGetFirstChild(parent);
Element GtSearchTypedElement(searchedType, scope, element);
Element GtSetAccessRight(element, right);
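
To make the idea of moving and searching through the tree more concrete, here is a minimal C sketch of an in-memory document tree using first-child and next-sibling pointers. The Node layout and the functions append_child and search_typed are hypothetical; they only imitate what functions such as GtGetFirstChild or GtSearchTypedElement might do, and the real Symposia structures are not described in this paper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical node layout for a structured document tree. */
typedef struct Node {
    const char *type;            /* element type, e.g. "P" */
    struct Node *first_child;
    struct Node *next_sibling;
} Node;

/* Appends child as the last child of parent. */
void append_child(Node *parent, Node *child)
{
    assert(parent != NULL && child != NULL);
    child->next_sibling = NULL;
    if (parent->first_child == NULL) {
        parent->first_child = child;
        return;
    }
    Node *last = parent->first_child;
    while (last->next_sibling != NULL)
        last = last->next_sibling;
    last->next_sibling = child;
}

/* Depth-first search, in document order, for the first element of
   the given type in the subtree rooted at scope (scope included).
   Returns NULL when no such element exists. */
Node *search_typed(Node *scope, const char *type)
{
    if (scope == NULL)
        return NULL;
    if (strcmp(scope->type, type) == 0)
        return scope;
    for (Node *c = scope->first_child; c != NULL; c = c->next_sibling) {
        Node *found = search_typed(c, type);
        if (found != NULL)
            return found;
    }
    return NULL;
}
```

Restricting the search to a scope is what lets an action look for, say, the anchors of one paragraph rather than of the whole document.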

Attribute handling

Other functions in the API make it possible to create a new attribute, to set its value and to attach it to an element:

Attribute GtNewAttribute(attributeType);
void GtSetAttributeValue(attribute, value, element);
void GtAttachAttributeContent(element, attribute);
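
The following C sketch illustrates, under our own assumptions, what setting, reading and removing an attribute on an element can amount to. The fixed-size attribute table and the functions set_attribute, get_attribute and remove_attribute are hypothetical and do not reproduce the Symposia functions listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ATTRS 8

/* Hypothetical attribute and element layout. Names and values are
   borrowed pointers: the caller keeps the strings alive. */
typedef struct {
    const char *name;
    const char *value;
} Attr;

typedef struct {
    Attr attrs[MAX_ATTRS];
    int count;
} Element;

/* Returns the value of the named attribute, or NULL if it is not set. */
const char *get_attribute(const Element *e, const char *name)
{
    assert(e != NULL && name != NULL);
    for (int i = 0; i < e->count; i++)
        if (strcmp(e->attrs[i].name, name) == 0)
            return e->attrs[i].value;
    return NULL;
}

/* Creates the attribute or overwrites its value.
   Returns 0 on success, -1 if the table is full. */
int set_attribute(Element *e, const char *name, const char *value)
{
    assert(e != NULL && name != NULL && value != NULL);
    for (int i = 0; i < e->count; i++) {
        if (strcmp(e->attrs[i].name, name) == 0) {
            e->attrs[i].value = value;
            return 0;
        }
    }
    if (e->count == MAX_ATTRS)
        return -1;
    e->attrs[e->count].name = name;
    e->attrs[e->count].value = value;
    e->count++;
    return 0;
}

/* Removes the attribute. Returns 1 if it was set, 0 otherwise. */
int remove_attribute(Element *e, const char *name)
{
    assert(e != NULL && name != NULL);
    for (int i = 0; i < e->count; i++) {
        if (strcmp(e->attrs[i].name, name) == 0) {
            e->attrs[i] = e->attrs[e->count - 1];  /* fill the hole */
            e->count--;
            return 1;
        }
    }
    return 0;
}
```

In an editor, each of these three operations would be a natural place to emit the corresponding attribute message.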

Listeners handling

The API includes functions for registering a new source of input, such as a socket or a FIFO:

void GtRegisterListener(fd, callback);
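
A listener mechanism of this kind is commonly built on a table of file descriptors that the event loop polls. The C sketch below shows one possible shape, using the POSIX poll() call. The names register_listener, dispatch_listeners and read_one_byte are hypothetical, and nothing here is taken from the actual implementation of GtRegisterListener.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define MAX_LISTENERS 8

typedef void (*ListenerCallback)(int fd);

static struct pollfd fds[MAX_LISTENERS];
static ListenerCallback callbacks[MAX_LISTENERS];
static int listener_count = 0;

/* Registers a descriptor and the callback to run when it is readable.
   Returns 0 on success, -1 if the table is full. */
int register_listener(int fd, ListenerCallback callback)
{
    assert(fd >= 0 && callback != NULL);
    if (listener_count == MAX_LISTENERS)
        return -1;
    fds[listener_count].fd = fd;
    fds[listener_count].events = POLLIN;
    fds[listener_count].revents = 0;
    callbacks[listener_count] = callback;
    listener_count++;
    return 0;
}

/* One pass of the event loop: waits at most timeout_ms milliseconds
   and calls the callback of every readable descriptor. Returns the
   number of callbacks invoked, or -1 if poll() failed. */
int dispatch_listeners(int timeout_ms)
{
    int called = 0;
    if (poll(fds, (nfds_t)listener_count, timeout_ms) < 0)
        return -1;
    for (int i = 0; i < listener_count; i++) {
        if (fds[i].revents & POLLIN) {
            callbacks[i](fds[i].fd);
            called++;
        }
    }
    return called;
}

/* Sample callback: consumes one byte and remembers it. */
char last_byte = 0;

void read_one_byte(int fd)
{
    if (read(fd, &last_byte, 1) != 1)
        last_byte = 0;
}
```

A client would call dispatch_listeners from its main loop, alongside the handling of user interface events, so that data arriving on a socket or FIFO can trigger actions on the document.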

View handling

The API includes functions to handle the multiple dynamic views defined by the Symposia style-sheet mechanism, such as opening a view containing a list of all the anchors in a document.

View GtOpenView(document, viewName);

3.5 Network handling

Symposia, as an authoring tool, is a WWW client and is connected to the network. Symposia has been designed to facilitate the creation and the maintenance of online data published directly on the Web. Symposia incorporates the CERN (now W3C) network library, and by using the Open URL and Save URL commands (the latter using the PUT method of the HTTP protocol), one can directly follow a link or save a document on a remote server. For all these reasons, it was important to allow great flexibility in the network handling of Symposia.

The Symposia API uses a very short list of functions implemented on top of the network library:

The set of functions that call network services is:

The writer of an extension could call, in the C action code, these API functions to load over the network the content of a URL or to save remotely in a URL the content of a string buffer (which can contain for example HTML/SGML fragments of data).
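
To give an idea of what saving a string buffer to a URL involves at the protocol level, the following C sketch formats an HTTP/1.0 PUT request for an HTML fragment. It is only an illustration: format_put_request is a hypothetical helper, the choice of headers is ours, and the sketch does not open a connection or use the network library.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a PUT request that stores body at path on the server.
   Returns the number of characters written (excluding the final
   '\0'), or -1 if out is too small. */
int format_put_request(char *out, size_t size,
                       const char *path, const char *body)
{
    assert(out != NULL && path != NULL && body != NULL);
    int n = snprintf(out, size,
                     "PUT %s HTTP/1.0\r\n"
                     "Content-Type: text/html\r\n"
                     "Content-Length: %lu\r\n"
                     "\r\n"
                     "%s",
                     path, (unsigned long)strlen(body), body);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

The resulting text would then be written to a connection opened to the server named in the URL; the server's response tells the client whether the document was stored.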

The set of functions which has to be implemented in the network library to call Symposia for feedback is:

The extensibility of Symposia, with regard to network handling, is based on two characteristics:

4 Some Applications

Online applications could be envisioned that make use of the extensibility features of Symposia for both the authoring and the viewing processes. These applications could be based on an HTML environment or on another SGML DTD.

For example, when creating manuals, an author often needs to gain access to information which has been created previously. This may involve taking fragments of data from various sources and integrating them into a single document.

The viewing process might require that the viewing tool provide the response to a user query regarding information stored on the network. In order to be accurate, the reply provided by the viewer should take account of certain constraints such as data contained in the document. The query could then be refined if such data was encoded within the document as SGML data.

The two applications which follow are examples of applications which have already been implemented using Symposia.

HTML 2.0 Forms

The HTML 2.0 support for Forms (editing and viewing) has been implemented in Symposia thanks to its extensibility:

<!-- The HTML 2.0 form definition contains multiple inputs -->
<!-- Input attributes have been turned, in memory,         -->
<!-- into elements to facilitate interactive editing       -->
<!-- Inputs are read/written in their original form        -->

<!ELEMENT form - - (Radio_Input, ...) >
<!ATTLIST Radio_Input CHECKED (CHECKED) #IMPLIED >

The presentation of the radio element is described in P, the presentation (style sheet) language of Symposia, which can modify the presentation of an element. The behavior of the radio element is described in an I file as follows:

Radio_Input:
         StdFollowLink: HTML2RadioAction;

The action HTML2RadioAction is executed when the message StdFollowLink (a double click by the end user) is sent to the element Radio_Input.

HTML2RadioAction(Element element) {
/* This is written by calling the C structured API */
/* of Symposia on the in-memory HTML instance      */
For all elements Radio before or after element
 {Search for attribute CHECKED
  Remove the attribute CHECKED}
Create and Set the attribute CHECKED on element
}
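
The pseudocode above can be sketched in plain C as follows. Since the corresponding Symposia API calls are not detailed here, the sketch works on a hypothetical Radio structure in which the CHECKED attribute is reduced to a flag and the radio inputs of one group are chained through prev and next pointers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical in-memory form of a Radio_Input element. */
typedef struct Radio {
    int checked;          /* 1 if the CHECKED attribute is set      */
    struct Radio *prev;   /* previous radio input of the same group */
    struct Radio *next;   /* next radio input of the same group     */
} Radio;

/* Removes CHECKED from every radio before and after the selected
   one, then sets CHECKED on the selected one. */
void radio_select(Radio *selected)
{
    assert(selected != NULL);
    for (Radio *r = selected->prev; r != NULL; r = r->prev)
        r->checked = 0;
    for (Radio *r = selected->next; r != NULL; r = r->next)
        r->checked = 0;
    selected->checked = 1;
}
```

Whatever the previous state of the group, exactly one radio input is checked after the action has run.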

Parts Lists

The previous example was built in the HTML environment. We give here another example using an SGML environment that models an illustrated parts list catalog. In an illustrated parts list catalog, keying the content of a part reference could automatically query a remote server for the part description, while keying the same content in the title of the document would do nothing. A Database menu could also be available for the part reference to give, through the network, the list of valid choices.

<!-- A parts list contains multiple block items -->
<!-- which describe parts and assemblies        -->
<!ELEMENT partlist (blockitem)*>
<!ATTLIST partlist URL %URL;>

<!ELEMENT blockitem - -
(supplier, supplierref, partref, type, expire?, quantity) >

The partlist has an attribute which indicates the URL of a database (or a CGI script) which gives back the blockitem SGML fragment corresponding to a particular partref.

The description of the behavior of the partref element may be expressed as follows:

partref:
        StdTextModify: ApplicationTextModify;

The action ApplicationTextModify is executed when the message StdTextModify (text has been modified) is sent to the element partref.

ApplicationTextModify (Element element) {
Fetch The Value of PARTREF
Move up to the partlist element
Fetch its URL
Query through the network with the URL and the PARTREF 
for receiving the corresponding BLOCKITEM
Move to the element SUPPLIER
Fill in this element from the Query Result
Move to the element SUPPLIERREF
Fill in this element from the Query Result
...
}

Note that this action is executed (and a query is sent over the network) only when a user types a part number into a partref element.
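
The network query step of this action needs a request URL built from the URL attribute of the partlist and the content of the partref element. The C sketch below shows one way of doing this. The function build_part_query and the query parameter name partref are our own assumptions, since the actual interface of the database or CGI script is not specified here.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Writes "<base_url>?partref=<percent-encoded partref>" into out.
   Returns the length of the result, or -1 if out is too small.
   The parameter name "partref" is an assumption. */
int build_part_query(char *out, size_t size,
                     const char *base_url, const char *partref)
{
    assert(out != NULL && base_url != NULL && partref != NULL);
    int n = snprintf(out, size, "%s?partref=", base_url);
    if (n < 0 || (size_t)n >= size)
        return -1;
    size_t pos = (size_t)n;
    for (const unsigned char *p = (const unsigned char *)partref;
         *p != '\0'; p++) {
        if (isalnum(*p) || *p == '-' || *p == '_' || *p == '.' || *p == '~') {
            if (pos + 1 >= size)      /* room for the char and '\0' */
                return -1;
            out[pos++] = (char)*p;
        } else {
            if (pos + 3 >= size)      /* room for "%XX" and '\0' */
                return -1;
            snprintf(out + pos, 4, "%%%02X", (unsigned)*p);
            pos += 3;
        }
    }
    out[pos] = '\0';
    return (int)pos;
}
```

The action would then load this URL over the network and use the returned blockitem fragment to fill in the supplier, supplierref and other elements.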

5 Related methods for extensibility

5.1 CGI & Forms

The Common Gateway Interface (CGI) is a standard for interfacing external applications with information servers such as HTTP or Web servers [NCSA 94]. A CGI program, written in a compiled language such as C or in an interpreted language such as Perl, can be installed on a Web server and executed when a WWW client accesses the program by its URL. The result of the CGI program is an HTML document which is sent to the WWW client.

Usually, CGI programs are executed through an HTML 2.0 Form tag [Berners 94] . HTML 2.0 Form tags are used as user interface objects in a WWW client to capture user preferences or user data and to execute on a remote server an attached CGI script with these data as parameters.
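
For readers unfamiliar with CGI, the following C sketch shows the essence of such a program: it produces a header, a blank line and an HTML body computed from the form data. The function name cgi_response and the page it generates are hypothetical. A real program would read the form data from the QUERY_STRING environment variable (for a GET request), write the response to its standard output, and escape the data before embedding it in HTML, which this sketch does not do.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a complete CGI response for the given query string.
   Returns the number of characters written, or -1 if out is too
   small. The query string is echoed without HTML escaping, which
   is acceptable only in a sketch. */
int cgi_response(char *out, size_t size, const char *query_string)
{
    assert(out != NULL);
    if (query_string == NULL)
        query_string = "";
    int n = snprintf(out, size,
                     "Content-type: text/html\r\n"
                     "\r\n"
                     "<html><body><p>Query: %s</p></body></html>\n",
                     query_string);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

A main function would typically pass getenv("QUERY_STRING") to cgi_response and print the buffer with fputs.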

This method is very simple and works very well but has certain shortcomings:

The method used by Symposia offers multiple advantages but requires a certain amount of programming in C and is therefore less simple to implement.

5.2 Java and HotJava

Java [Java 95] is an object-oriented programming language used for the creation of distributed, executable applications. Java was defined by SUN Microsystems, which has also implemented HotJava, a WWW browser that can execute applets. An applet is a Java program that can be included in an HTML page, much like an image can be included.

Java applets are an excellent example of extensibility and have already been used in a wide range of useful applications. They present the following advantages:

However,

The greatest advantage of Java is undoubtedly the ease with which external applications are installed and accessed over the network. Symposia needs to evolve towards this kind of approach where the actions, currently defined in C, could be interpreted instead of compiled.

6 Conclusion

It is essential for WWW clients to be extensible if we are to be able to build integrated authoring environments on the WWW. These authoring environments must be adapted to the type of documents that we want to produce.

We have seen the advantages to be gained from basing the extensibility of a WWW client on a structured authoring environment such as that offered by Symposia, where data is clearly identified and where tools can access and work directly with these elements of data over the network.

By adopting a structured approach to information authoring and retrieval on the WWW, we can intelligently access and manipulate semantically identified data on both the client and the server sides.

Such an approach enables us to build Document Oriented user Interfaces for the documentary data manipulated on the Web.

Symposia must evolve so that the installation and use of its extensions can be achieved with as much ease as is currently possible with the Java extensions.

7 Acknowledgements

We would like to thank V. Quint and I. Vatton from INRIA, L. Pedersen, B.V. Sydow and A. Slominski from the EUROMATH project and P. Telegone from GRIF S.A. for the very interesting and fruitful discussions which helped define the needs for the extensibility of Symposia. We would also like to thank Stuart Culshaw at Grif S.A. for his help in reviewing this paper.

8 References

[Berners 94] T. Berners-Lee, D. Connolly, 'Hypertext Markup Language Specifications - 2.0', Internet Draft, http://www.w3.org/hypertext/WWW/MarkUp/html-spec/html-spec_2.html, May 1995.

[OpenDoc 94] The OpenDoc Design Team OpenDoc, 'the Required Reading Packet' , http://www.info.apple.com/dev/du/intro_to_opendoc/iod0_index.html, 1994.

[Quint 95] V. Quint, C. Roisin, I. Vatton, 'A structured authoring environment for the World-Wide Web', Proceedings of the Third International World Wide Web Conference, edited by Computer Networks and ISDN systems, pp. 831-840, April 1995.

[DeRose 94] S. DeRose, D. Durand, 'Making Hypermedia Work, A User's Guide to HyTime', Kluwer Academic Publishers, 1994.

[Paoli 95] J. Paoli, 'Cooperative work on the network: edit the WWW!', Proceedings of the Third International World Wide Web Conference, edited by Computer Networks and ISDN systems, pp. 841-847, April 1995.

[Bier 90] E. Bier and A. Goodisman, 'Documents as User Interfaces', EP 90, Proceedings of the International Conference on Electronic Publishing, Document Manipulation & Typography, R. Furuta ed., pp. 249-262, Cambridge University Press, September 1990.

[Java 95] SUN Microsystems, 'The Java Programming Language', http://java.sun.com, May 1995.

[NCSA 94] NCSA httpd Development Team, 'The Common Gateway Interface', http://hoohoo.ncsa.uiuc.edu/cgi, May 1994.

[Paoli 94] J. Paoli, 'Creating SGML objects for End-Users - Establishing SGML in an interactive world', Proceedings of SGML '94, GCA, ed., pp. 323-333, December 1994.

[Quint 94] V. Quint, I. Vatton, 'Making Structured Documents Active', Electronic Publishing - Origination, Dissemination and Design , vol. 7, num. 3, 1994.

[Terry 90] D.B. Terry and D.G. Baker, 'Active Tioga Documents: an Exploration of Two Paradigms', Electronic Publishing - Origination, Dissemination and Design, pp. 105-122, May 1990.

[Sperberg 94] C. M. Sperberg-McQueen, Robert F. Goldstein, 'HTML to the Max A Manifesto for Adding SGML Intelligence to the World-Wide Web', http://www.ncsa.uiuc.edu/SDG/IT94/Proceedings/Autools/sperberg-mcqueen/sperberg.html, October 1994.

9 Biography: Jean Paoli

Jean Paoli is the Technical Director and a co-founder of GRIF S.A., a leader in the creation of SGML authoring tools. He supervises the development and implementation of Grif's WYSIWYG SGML products, the latest being Grif SGML Editor for Macintosh and GATE, an interactive SGML API. Paoli manages GRIF S.A. application consulting groups as well as research and strategic planning toward the Grif technology.

He is currently driving a joint INRIA/GRIF S.A. project for the development of Symposia, a WWW editor which enables collaborative authoring on the network.

Paoli draws on more than 10 years of experience in the structured editing field. Before co-founding GRIF S.A., he worked on structured editors for programming languages with the leading French software house SEMA-GROUP and France's leading computing research institute, INRIA. Jean holds a specialisation in software engineering and is a graduate of the Ecole Nationale des Ponts et Chaussées.