NOTE-widl-970922
This document is a NOTE made available by the W3 Consortium for discussion only. This indicates no endorsement of its content, nor that the Consortium has, is, or will be allocating any resources to the issues addressed by the NOTE.
This document is a submission to W3C from webMethods, Inc. Please see Acknowledged Submissions to W3C regarding its disposition.
Copyright (c) 1997 webMethods, Inc.
This document provides the specification for the Web Interface Definition Language (WIDL), a metalanguage that implements a service-based architecture over the document-based resources of the World Wide Web. WIDL is an application of the eXtensible Markup Language (XML); it allows interactions with Web servers to be defined as functional interfaces that can be accessed by remote systems over standard Web protocols, and provides the structure necessary for generating client code in languages such as Java, C/C++, COBOL, and Visual Basic. WIDL enables a practical and cost-effective means for diverse systems to be rapidly integrated across corporate intranets, extranets, and the Internet.
The World Wide Web is providing millions of end-users access to ever-increasing volumes of information. While the Web initially served browsers static documents, the resources of legacy systems, relational databases and multi-tier applications have all been made available to the Web browser to provide corporate users with interactive information resources including financial services, inventory management, on-line purchasing and package tracking.
While the Web has achieved the extraordinary feat of providing ubiquitous accessibility to end-users, it has in many cases reinforced manual inefficiencies in business processes as repetitive tasks are required to transcribe or copy and paste data from browser windows into desktop and corporate applications. This is as true of Web data provided by remote business units and external (i.e. partner or supplier) organizations as it is of Web data accessible from both public and subscription based Web sites. The problem of direct access to Web data from within business applications has been largely ignored.
The purpose of the Web Interface Definition Language (WIDL) is to enable automation of all interactions with HTML/XML documents and forms, providing a general method of representing request/response interactions over standard Web protocols, and allowing the Web to be utilized as a universal integration platform.
A central feature of WIDL is that programmatic interfaces can be defined and managed for data (HTML, XML or text files) and services (CGI-bin, database, or other back end systems) that are not under the direct control of programs that require such access. WIDL definitions can be co-located with client programs, centrally managed in a client/server architecture, or referenced directly from HTML/XML documents.
WIDL definitions provide a mapping between Web resources and applications written in conventional programming languages such as C/C++, COBOL, Visual Basic, Java, JavaScript, etc., enabling automatic and structured Web access by compatible client programs, including mainstream business applications, desktop applications, applets, Web agents, and server-side Web programs (CGI, etc.).
Automatic means that complex interactions with Web servers do not require human intervention; programs can request Web data and services by making local calls to functions which encapsulate standard Web access protocols and utilize WIDL definitions to provide naming services, change management, error handling, condition processing and intelligent data binding.
Structured means that Web data and services are described as interfaces with well defined input and output variables.
Standard Web access protocols means HTTP and HTTPS.
Compatible means any program that both utilizes WIDL definitions to define the location of Web services and the structure of data that is returned by standard HTTP and HTTPS requests, and allows WIDL definitions to be managed locally, centrally, or by individual service providers.
WIDL describes business objects on the Web, providing the basis for a common API across Web servers, legacy systems, databases, and middleware infrastructures, and effectively transforming the Web from an access medium into an integration platform.
This document provides a complete description of the Web Interface Definition Language (WIDL).
A major part of the value of an Interface Definition Language (IDL) is that it can define services offered by applications in an abstract but highly usable fashion. WIDL brings to the Web many of the IDL concepts that have been implemented in distributed computing and transaction processing platforms, including DCE and CORBA.
WIDL makes it easy for organizations to automate business transactions with customers and suppliers. WIDL describes and automates interactions with services hosted by Web servers on intranets, extranets and the Internet; it transforms the Web into a standard integration platform and provides a universal API for all Web-enabled systems.
Using HTML, XML, HTTP and HTTPS as corporate standards glue, WIDL requires only that target systems be Web-enabled. There are hundreds of products in the market today which Web-enable existing systems, from mainframes to client/server applications. The use of standard Web technologies empowers various IT departments to make independent technology selections. This has the effect of lowering both the technical and 'political' barriers that have typically derailed cross-organizational integration projects.
A number of analysts have already warned that proprietary e-commerce platforms could lock suppliers into relationships by forcing them to integrate their systems with one infrastructure for business-to-business integration, making it costly for them to switch to or integrate with other partners who have selected alternate e-commerce platforms. Buyer-supplier integration issues involve many-to-many relationships, and demand a standard platform for functional integration and data exchange.
A service defined by WIDL is equivalent to a function call in standard programming languages. At the highest level, WIDL files describe the locations (URLs) of services, input parameters to be submitted (via Get or Post methods) to each service, conditions for successful processing, and output parameters to be returned by each service.
WIDL provides the following features:
WIDL can be used to describe interfaces and services for:
WIDL can be used:
WIDL has the ability to specify conditions for successful processing, and error messages to be returned to calling programs. Conditions further enable services to be defined that span multiple documents.
One of WIDL's most significant benefits is its ability to insulate client programs from changes in the format and location of Web documents. Unlike the way CORBA and DCE IDL are normally used, WIDL is interpreted at runtime; as a result, service URLs, object references in variables, definitions of document regions, success/failure conditions, and directives for service chaining can all be administered without requiring modification of client code. This usage model supports application-to-application linkages that are far more robust and maintainable than if they were coded by hand.
There are three models for WIDL management:
WIDL does not require that existing Web resources be modified in any way. Flexible management models allow organizations to describe and integrate Web sites that are uncontrolled, as well as to provide their business partners with interfaces to services that are controlled. The ability to seamlessly migrate from independent to shared management eases the transition from informal to formal business-to-business integration.
The primary purpose of WIDL is integration of Web resources with corporate business applications. In much the same way that DCE or CORBA IDL is used to generate code fragments, or 'stubs', to be included in application development projects, WIDL provides the structure necessary for generating client code in languages such as C/C++, Java, COBOL, and Visual Basic. Developers can thus be insulated from the need to understand both HTML/XML parsing and Web protocols. This capability enables the existing skills of innumerable programmers to be rapidly leveraged in the utilization of Web based resources.
Many of the features of WIDL require a capability to reliably identify and extract specific data elements from Web documents. Various mechanisms for accessing elements of HTML and/or XML documents have been defined, such as the JavaScript Page Object Model, the Document Object Model, and XLL XPointers. WIDL does not define or determine a mechanism for accessing document data, but rather allows an object model referencing mechanism to be specified on a per-interface basis.
The following capabilities are desirable for accessing elements of Web documents:
Object referencing mechanisms would ideally support both parsing and pattern matching. Pattern matching extracts data based on regular expressions, and is well suited to raw text files and poorly constructed HTML documents. Parsing, on the other hand, recovers document structure and exposes relationships between document objects, enabling elements of a document to be accessed with an object model.
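The examples in this document write object references in a dotted form such as doc.title[0].text. As a minimal sketch of how a client might take such a reference apart before resolving it against an object model, the Python below uses a regular expression. The grammar it accepts (root.tag[index].property, with the index and property optional) is an assumption inferred from those examples, not a definition given by WIDL, and parse_reference is a hypothetical helper name.

```python
import re

# Assumed grammar: root.tag[index].property, index and property optional.
REFERENCE = re.compile(r"^(\w+)\.(\w+)\[(\d*)\](?:\.(\w+))?$")

def parse_reference(ref):
    """Split an object reference into (root, tag, index, property)."""
    match = REFERENCE.match(ref)
    if match is None:
        raise ValueError(f"unsupported object reference: {ref!r}")
    root, tag, index, prop = match.groups()
    # An empty index such as a[] is read here as "all matching elements".
    return root, tag, int(index) if index else None, prop

title_ref = parse_reference("doc.title[0].text")
links_ref = parse_reference("tops.a[].href")
start_ref = parse_reference("doc.p[3]")
```

A real object model would then resolve the parsed parts against the parsed document; that step depends on the object model selected for the interface and is not sketched here.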
The following example illustrates the use of WIDL to define a package tracking service for generic Shipping. By allowing a WIDL definition to reference a 'Template' WIDL definition, a general class of shipping services can be defined. 'FoobarShipping' is one implementation of the 'Shipping' interface.
<WIDL NAME="genericShipping" TEMPLATE="Shipping" BASEURL="http://www.shipping.com" VERSION="2.0">
  <SERVICE NAME="TrackPackage" METHOD="Get" URL="/cgi-bin/track_package" INPUT="TrackInput" OUTPUT="TrackOutput" />
  <BINDING NAME="TrackInput" TYPE="INPUT">
    <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
    <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
    <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
  </BINDING>
  <BINDING NAME="TrackOutput" TYPE="OUTPUT">
    <CONDITION TYPE="Failure" REFERENCE="doc.title[0].text" MATCH="Warning Form" REASONREF="doc.p[0].text" />
    <CONDITION TYPE="Success" REFERENCE="doc.title[0].text" MATCH="Foobar Airbill:*" REASONREF="doc.p[1].value" />
    <VARIABLE NAME="disposition" TYPE="String" REFERENCE="doc.h[3].value" />
    <VARIABLE NAME="deliveredOn" TYPE="String" REFERENCE="doc.h[5].value" />
    <VARIABLE NAME="deliveredTo" TYPE="String" REFERENCE="doc.h[7].value" />
  </BINDING>
</WIDL>
In this example, the values defined in the 'TrackInput' binding get passed via HTTP Get as name-value pairs to a service residing at 'http://www.shipping.com/cgi-bin/track_package'. Object References are used in the 'TrackOutput' binding to a) check for successful completion of the service, and b) extract data elements from the document returned by the HTTP request.
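To make the request side of this example concrete, the following Python sketch reads a cut-down copy of the definition and builds the URL that a Get request would use. It is an illustration only and not part of the specification: build_get_url and the sample argument values are assumptions, and the sketch ignores output bindings, conditions, and the USAGE attribute.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode, urljoin

WIDL_TEXT = """
<WIDL NAME="genericShipping" TEMPLATE="Shipping" BASEURL="http://www.shipping.com" VERSION="2.0">
  <SERVICE NAME="TrackPackage" METHOD="Get" URL="/cgi-bin/track_package"
           INPUT="TrackInput" OUTPUT="TrackOutput" />
  <BINDING NAME="TrackInput" TYPE="INPUT">
    <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
    <VARIABLE NAME="DestCountry" TYPE="String" FORMNAME="dest_cntry" />
    <VARIABLE NAME="ShipDate" TYPE="String" FORMNAME="ship_date" />
  </BINDING>
</WIDL>
"""

def build_get_url(widl_text, service_name, args):
    """Return the URL a Get request for the named service would use."""
    root = ET.fromstring(widl_text)
    service = next(s for s in root.iter("SERVICE") if s.get("NAME") == service_name)
    binding = next(b for b in root.iter("BINDING") if b.get("NAME") == service.get("INPUT"))
    # Map each program variable to the form field name expected by the server.
    pairs = [(v.get("FORMNAME", v.get("NAME")), args[v.get("NAME")])
             for v in binding.iter("VARIABLE")]
    base = urljoin(root.get("BASEURL", ""), service.get("URL"))
    return base + "?" + urlencode(pairs)

# Sample values are made up for illustration.
url = build_get_url(WIDL_TEXT, "TrackPackage",
                    {"TrackingNum": "12345", "DestCountry": "US", "ShipDate": "970922"})
print(url)
```

The caller works only with the variable names declared in the binding (TrackingNum, DestCountry, ShipDate); the form field names sent to the server come from the WIDL definition.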
WIDL enables common interfaces to services provided by multiple sites. Templates allow the specification of interfaces, implementations of which may be available from multiple sources. A shipping template defines a functional interface for shipping services; various implementations can be provided for Federal Express, UPS, and DHL.
<WIDL TEMPLATE="Shipping">
  <SERVICE NAME="packageTrack" ... />
  <SERVICE NAME="schedulePickup" ... />
  ...
Service definitions support TIMEOUT and RETRIES attributes to handle situations in which a Web server responds intermittently. If a service does not complete within TIMEOUT seconds, it is tried again up to RETRIES times, after which it fails.
<SERVICE NAME="schedulePickup" METHOD="POST" URL="http://www.fooShipping.com" INPUT="pickupInput" OUTPUT="pickupOutput" TIMEOUT="5" RETRIES="5" /> ...
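The retry behaviour can be sketched in Python as a small loop. This is one reading of the text, assumed here: the first attempt is followed by at most RETRIES further attempts. The names call_with_retries and fake_fetch are hypothetical, and fake_fetch stands in for a real HTTP request so that the example runs without a network.

```python
def call_with_retries(fetch, timeout, retries):
    """Call fetch(timeout); on TimeoutError retry up to `retries` more times."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return fetch(timeout), attempts
        except TimeoutError:
            if attempts > retries:
                raise  # initial attempt plus all retries have timed out

# Stand-in for an HTTP request: times out twice, then succeeds.
outcomes = iter([TimeoutError, TimeoutError, "<html>ok</html>"])

def fake_fetch(timeout):
    outcome = next(outcomes)
    if outcome is TimeoutError:
        raise TimeoutError(f"no response within {timeout}s")
    return outcome

result, attempts = call_with_retries(fake_fetch, timeout=5, retries=5)
```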
'Input' and 'Output' bindings specify the input and output variables of a particular service. Input bindings define the name-value pairs to be passed via Get or Post methods to a Web-based application. Output bindings use object references to identify and extract data elements from documents returned by HTTP requests.
<BINDING NAME="TrackInput" TYPE="INPUT">
  ...
<BINDING NAME="TrackOutput" TYPE="OUTPUT">
  ...
The FORMNAME attribute of a variable declaration enables the parameters of a service to be re-named. This feature is useful for defining multiple implementations of a service which require a common interface, yet must pass the proper name-value pairs to individual Web-sites where the services are hosted.
<BINDING NAME="TrackInput" TYPE="INPUT">
  <VARIABLE NAME="TrackingNum" TYPE="String" FORMNAME="trk_num" />
  ...
The values of variables declared as USAGE="Header" are passed as part of the HTTP header and are not included in the name-value pairs submitted to back-end services (e.g. a CGI script) via Get or Post. Variable values may also be hardcoded by providing a VALUE attribute.
<BINDING NAME="anInput" TYPE="INPUT">
  <VARIABLE NAME="REFERRER" TYPE="String" VALUE="http://www.company.com" USAGE="HEADER" />
  ...
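A client that honours USAGE has to route each input variable to the right part of the request. The Python sketch below shows one way this might be done, with variables represented as plain dictionaries of their attributes. The helper name split_inputs, the sample variables, and the case-insensitive comparison of USAGE values are assumptions made for illustration.

```python
def split_inputs(variables, args):
    """Split input variables into HTTP headers and form fields by USAGE."""
    headers, fields = {}, {}
    for var in variables:
        # A hardcoded VALUE takes the place of a caller-supplied argument.
        value = var.get("VALUE", args.get(var["NAME"]))
        usage = var.get("USAGE", "Default").lower()
        if usage == "header":
            headers[var["NAME"]] = value
        elif usage == "default":
            fields[var.get("FORMNAME", var["NAME"])] = value
        # USAGE="Internal" variables are sent neither as headers nor as fields.
    return headers, fields

# Hypothetical input binding: one header, one ordinary field, one internal.
variables = [
    {"NAME": "REFERRER", "VALUE": "http://www.company.com", "USAGE": "HEADER"},
    {"NAME": "Query", "FORMNAME": "q"},
    {"NAME": "state", "USAGE": "INTERNAL"},
]
headers, fields = split_inputs(variables, {"Query": "widl", "state": "virginia"})
```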
The values of variables declared as USAGE="Internal" are accessible within WIDL declarations and are not included in the name-value pairs submitted to back-end services (e.g. a CGI script) via Get or Post. In this example the value of the input variable 'state' is used to complete the URL for the service 'AutoLoan'. Internal variables enable URL directory structures to be interactively 'queried'.
<SERVICE NAME="AutoLoan" METHOD="Get" URL="http://www.autoloan.com/%state%.html" INPUT="AutoLoanInput" OUTPUT="AutoLoanOutput" />
<BINDING NAME="AutoLoanInput" TYPE="INPUT">
  <VARIABLE NAME="state" TYPE="String" USAGE="INTERNAL" />
  ...
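The %state% placeholder suggests a simple textual substitution. A minimal Python sketch, assuming placeholders are variable names wrapped in percent signs, follows; expand_url is a hypothetical helper, the sample value is made up, and the sketch makes no attempt to distinguish placeholders from percent-encoded characters that may already be present in a URL.

```python
import re

def expand_url(url_template, args):
    """Replace each %name% placeholder with the matching argument value."""
    # Names must start with a letter or underscore, so %20 is left alone.
    return re.sub(r"%([A-Za-z_]\w*)%", lambda m: args[m.group(1)], url_template)

url = expand_url("http://www.autoloan.com/%state%.html", {"state": "virginia"})
```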
The REGION element defines an area of a document by specifying START and END object references. Named regions permit the extraction of data elements relative to other features of a document. Regions are addressed using object references that begin with the region name.
<BINDING NAME="NewsOut" TYPE="OUTPUT">
  <REGION NAME="tops" START="doc.p[3]" END="doc.h[4]" />
  <VARIABLE NAME="stories" TYPE="String[]" REFERENCE="tops.a[].text" />
  <VARIABLE NAME="links" TYPE="String[]" REFERENCE="tops.a[].href" />
  ...
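The effect of a region can be illustrated with a deliberately simplified document model. In the Python sketch below a document is a flat list of elements in document order, and a region is the slice between a start element and an end element. The model, the helper names nth and region, the sample page, and the choice to exclude the boundary elements are all assumptions; WIDL leaves these details to the object model in use.

```python
def nth(elements, tag, index):
    """Position in `elements` of the index-th element with the given tag."""
    positions = [i for i, el in enumerate(elements) if el["tag"] == tag]
    return positions[index]

def region(elements, start, end):
    """Elements strictly between the start and end markers (tag, index)."""
    return elements[nth(elements, *start) + 1 : nth(elements, *end)]

# Hypothetical news page: only the links between the first paragraph
# and the first heading are the top stories.
doc = [
    {"tag": "a", "text": "Home", "href": "/"},
    {"tag": "p", "text": "Top stories"},
    {"tag": "a", "text": "Story one", "href": "/1"},
    {"tag": "a", "text": "Story two", "href": "/2"},
    {"tag": "h", "text": "Archive"},
    {"tag": "a", "text": "Old story", "href": "/old"},
]

tops = region(doc, start=("p", 0), end=("h", 0))
stories = [el["text"] for el in tops if el["tag"] == "a"]
links = [el["href"] for el in tops if el["tag"] == "a"]
```

Without the region, a reference to all anchors in the document would also pick up the navigation link and the archive link.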
Conditions define 'success' and 'failure' states for output bindings, and determine whether a binding attempt should be retried in the case of a 'server busy' error:
<BINDING NAME="AutoLoanOutput" TYPE="OUTPUT">
  <CONDITION TYPE="FAILURE" REASONTEXT="State not found" />
  ...
<BINDING NAME="TrackOutput" TYPE="OUTPUT">
  <CONDITION TYPE="SUCCESS" REFERENCE="doc.title[0].text" MATCH="Shipping Airbill:*" REASONREF="doc.p[1].value" />
  ...
<BINDING NAME="getPrice" TYPE="OUTPUT">
  <CONDITION TYPE="RETRY" REFERENCE="doc.headings[0].text" MATCH="*Server Busy*" WAIT="5" RETRIES="4" />
  ...
Conditions can apply to a binding as a whole, or to a specific object reference. Conditions can define error messages to be returned as the value of the service; error messages can be a literal, or can be extracted from the returned document.
This mechanism handles unexpected errors, and can also be used to map application-level errors from the back-end program (e.g. responses resulting from invalid or missing input values).
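A rough Python sketch of condition checking is given below. It assumes that MATCH patterns use * as a wildcard, as the examples suggest, and it reports the first condition whose pattern matches; the precise evaluation order and pattern syntax are not defined in this section, so both are assumptions. The function check_conditions, the lookup callable, and the sample page contents are hypothetical.

```python
from fnmatch import fnmatchcase

def check_conditions(conditions, lookup):
    """Return (type, reason) for the first condition whose MATCH pattern fits."""
    for cond in conditions:
        if fnmatchcase(lookup(cond["REFERENCE"]) or "", cond["MATCH"]):
            reason_ref = cond.get("REASONREF")
            # A literal REASONTEXT wins; otherwise read the reason from the page.
            reason = cond.get("REASONTEXT") or (lookup(reason_ref) if reason_ref else None)
            return cond["TYPE"], reason
    return None, None

# Stand-in for a parsed document: object reference -> extracted text.
page = {
    "doc.title[0].text": "Warning Form",
    "doc.p[0].text": "Unknown tracking number",
}
conditions = [
    {"TYPE": "Failure", "REFERENCE": "doc.title[0].text",
     "MATCH": "Warning Form", "REASONREF": "doc.p[0].text"},
    {"TYPE": "Success", "REFERENCE": "doc.title[0].text",
     "MATCH": "Foobar Airbill:*", "REASONREF": "doc.p[1].value"},
]
status, reason = check_conditions(conditions, page.get)
```

Here the returned page is the back-end program's error form, so the failure condition matches and its message is read from the document rather than hardcoded.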
Conditions can direct a service to attempt an alternate binding for the extraction of output values:
<BINDING NAME="getPrice" TYPE="Output">
  <CONDITION TYPE="FAILURE" REBIND="shirtPrice" />
  ...
<BINDING NAME="shirtPrice" TYPE="Output">
  ...
Multiple bindings are useful in situations where the documents returned by a back-end program are dependent upon the input criteria that were submitted in the HTTP request. For example, a retail Web site may return a document with a different structure for an SKU depending on whether the item requested is a shirt, a tie, or trousers. The use of multiple bindings allows a condition to determine the appropriate binding for extracting the desired data. The alternate binding named by REBIND must be defined in the same WIDL interface.
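The fallback from one binding to another might look like the following Python sketch. Bindings are modelled as dictionaries, a binding attempt fails when any of its references yields no value, and a failed binding hands over to the one named by its rebind entry. The data layout, the object references, and the function name bind_output are assumptions made for illustration.

```python
def bind_output(bindings, name, page):
    """Extract variables with the named binding, following REBIND on failure."""
    tried = set()
    while name is not None and name not in tried:
        tried.add(name)  # guards against bindings that rebind in a cycle
        binding = bindings[name]
        values = {var: page.get(ref) for var, ref in binding["variables"].items()}
        if all(v is not None for v in values.values()):
            return name, values
        name = binding.get("rebind")
    raise LookupError("no binding matched the returned document")

bindings = {
    "getPrice": {"variables": {"price": "doc.td[4].text"}, "rebind": "shirtPrice"},
    "shirtPrice": {"variables": {"price": "doc.td[7].text"}},
}
# Stand-in for a parsed shirt page, where the price sits in a different cell.
page = {"doc.td[7].text": "$19.99"}
bound = bind_output(bindings, "getPrice", page)
```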
Conditions can direct a service to initiate a service chain, in which case the name-value pairs of an output binding are passed into a second service. The name-value pairs must match the variable names in the input binding of the second service for the service-chain to succeed.
<BINDING NAME="productSearchOutput" TYPE="Output">
  <CONDITION TYPE="Success" SERVICE="ExtractPrices" />
  ...
Service chains can be used with Web-based e-commerce systems when it is necessary to invoke multiple services in sequence to complete a purchase.
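The name-matching requirement for a service chain can be sketched in a few lines of Python. The function chain, the variable names, and the sample values are hypothetical; the sketch only checks that every input variable of the second service has a same-named value in the first service's output.

```python
def chain(first_output, second_input_names):
    """Pass one service's output values to the next service's input binding."""
    missing = [n for n in second_input_names if n not in first_output]
    if missing:
        raise KeyError(f"chain cannot start, unmatched inputs: {missing}")
    return {n: first_output[n] for n in second_input_names}

# Hypothetical output of a product search, fed into a second service.
search_output = {"sku": "A-100", "page": "2"}
next_inputs = chain(search_output, ["sku"])
```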
The Web Interface Definition Language (WIDL) is an application of the eXtensible Markup Language (XML); its definition consists of the various XML elements defined in this section.
The following sections define the elements of WIDL.
<WIDL> is the parent element for the Web Interface Definition Language; it defines an interface. Interfaces are groupings of related services and bindings. The following are attributes of the <WIDL> element:
Attribute | Description | Type | # | Default |
NAME | Establishes a name for an interface. The interface name is used in conjunction with a service name for naming or directory services. | String | Exactly One | |
VERSION | Identifies the version of WIDL. | String | 0 or 1 | "2.0" |
TEMPLATE | WIDL enables common interfaces to services provided by multiple vendors. A shipping template defines a functional interface for shipping services; various implementations can be provided for Federal Express, UPS, and DHL. | URI | 0 or 1 | |
BASEURL | BASEURL is similar to the <BASE HREF=""> statement in HTML. Some of the services within a given WIDL may be hosted from the same Base URL. If BASEURL is defined, the URL for various services can be defined relative to BASEURL. This feature is useful for replicated sites which can be addressed by changing only the BASEURL, instead of the URL for each service. | URI | 0 or 1 | |
OBJMODEL | Specifies an object model to be used for extracting data elements from HTML and XML documents. Object models are the result of parsing HTML or XML documents. The use of object models is central to the functionality of WIDL. Object References are used in <VARIABLE/>, <CONDITION/> and <REGION/> elements. | String | 0 or 1 | wmdom |
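The relationship between BASEURL and a service URL resembles ordinary relative URL resolution. A minimal Python sketch using the standard library's urljoin is shown below. Treating WIDL resolution as identical to urljoin is an assumption, service_url is a hypothetical helper, and the mirror host name is invented for the example.

```python
from urllib.parse import urljoin

def service_url(base_url, url):
    """Resolve a service URL against an optional BASEURL."""
    return urljoin(base_url, url) if base_url else url

# Switching to a replicated site changes only the base, not each service URL.
primary = service_url("http://www.shipping.com", "/cgi-bin/track_package")
mirror = service_url("http://mirror.shipping.example", "/cgi-bin/track_package")
# A fully qualified service URL is unaffected by the base.
absolute = service_url("http://www.shipping.com", "http://www.fooShipping.com")
```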
The <SERVICE/> element describes a request/response interaction with a Web server. Web servers use Get and Post methods to return documents and invoke CGI scripts and services via NSAPI, ISAPI, or other back-end Web server programs. Web servers typically take a set of input parameters, perform some processing, then return a static or dynamically generated HTML, XML or text document.
The attributes of the <SERVICE/> element map an abstract service name to an actual URL, specify the HTTP method to be used to access the service, and designate 'bindings' for input and output parameters.
Attribute | Description | Type | # | Default |
NAME | Establishes a name for a service. The service name is used in conjunction with an interface name for naming or directory services | String | Exactly One | |
URL | Specifies the Uniform Resource Locator (URL) for the target document. A service URL can be either a fully qualified URL or a partial URL that is relative to the BaseURL provided as an attribute of the <WIDL> element. | URI | Exactly One | |
METHOD | Specifies the HTTP method ("Get" or "Post") to be used to access the service. | String | Exactly One | "Get" |
INPUT | Designates the <BINDING> to be used to define the input parameters for programs that call the service. The specified name must be that of a <BINDING> contained within the same <WIDL> as the service. | String | 0 or 1 | |
OUTPUT | Designates the <BINDING> to be used to define the output parameters for programs that call the service. The specified name must be that of a <BINDING> contained within the same <WIDL> as the service. | String | 0 or 1 | |
AUTHUSER | Establishes the Username to be submitted for HTTP Authentication. | String | 0 or 1 | |
AUTHPASS | Establishes the Password to be submitted for HTTP Authentication. | String | 0 or 1 | |
TIMEOUT | If the service does not complete within the specified number of seconds, it is tried again up to RETRIES times after which it fails. | String | 0 or 1 | |
RETRIES | Number of times to retry the service before failing. | String | 0 or 1 |
The <BINDING> element defines input and output variables for a service. Input bindings describe the data submitted via Get or Post to a Web server; for example, the input fields in an HTML form. Static HTML documents do not require input variables. Output bindings describe which data elements are to be mapped from the output document and returned as a result of an HTTP request to a Web server with the given input variables. In most cases an output binding will map only a subset of the available data elements in the output document.
Attribute | Description | Type | # | Default |
NAME | Establishes a name for a binding. The name is used in <SERVICE/> definitions and in <CONDITION/> statements that initiate service chains. | String | Exactly One | |
TYPE | Specifies whether a binding defines input or output variables. | String | Exactly One | "Output" |
The <VARIABLE/> element is used to describe both input and output binding parameters. Different attributes are used depending on the type of parameter being described.
Attribute Name | Description | Type | # | Default |
NAME | Identifies both the program variable and the VARIABLE definition itself. | String | Exactly One | |
FORMNAME | BINDING TYPE="Input": Specifies the variable name to be submitted via Get or Post methods. Obscure back-end variables can be given names that are more meaningful in the context of the service described by WIDL. Used in conjunction with WIDL Templates, FORMNAME permits the mapping of a single variable name across multiple service implementations. | String | 0 or 1 | |
TYPE | Specifies both the data type and dimension of the variable. | String | 0 or 1 | |
USAGE | The default usage of variables is for specification of input and output parameters. Variables can also be used internally within WIDL, as well as to pass header information in an HTTP request. For instance, using internal variables a portion of a service's URL or a pattern for matching within an object reference can be specified as a variable that is part of an input binding. | String | Exactly One | Default |
VALUE | Designates a value to be assigned to the variable in HTTP transactions. For input variables this has the effect of rendering the variable invisible to calling programs, i.e. the specified value is submitted to the Web server without requiring an input from calling programs. For output variables this has the effect of hard-coding the value returned when the service is invoked. | String | 0 or 1 | |
REFERENCE | BINDING TYPE="Output": Any valid object reference defined by the specified Object Model. Identifies a property (typically the value) of an HTML document object to be assigned to the associated program output variable. If the identified object is not present in the output document, or the specified property is not valid for the object, null is assigned into the program output variable. | String | 0 or 1 | |
NULLOK | BINDING TYPE="Output": Overrides the implicit condition that all output variables return a non-null value. | String | 0 or 1 | "False" |
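The interaction between REFERENCE and NULLOK can be sketched in Python as follows. Variables are modelled as dictionaries of their attributes and the parsed document as a mapping from object reference to value. The helper name bind_variables, the sample data, and the case-insensitive reading of the NULLOK value are assumptions.

```python
def bind_variables(variables, page):
    """Extract output values; fail if a value is null and NULLOK is not true."""
    values = {}
    for var in variables:
        value = page.get(var["REFERENCE"])  # None when the object is absent
        if value is None and var.get("NULLOK", "False").lower() != "true":
            raise ValueError(f"variable {var['NAME']} is null")
        values[var["NAME"]] = value
    return values

variables = [
    {"NAME": "disposition", "REFERENCE": "doc.h[3].value"},
    {"NAME": "deliveredTo", "REFERENCE": "doc.h[7].value", "NULLOK": "True"},
]
# Stand-in for a parsed page that has no recipient listed yet.
page = {"doc.h[3].value": "In transit"}
values = bind_variables(variables, page)
```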
The <CONDITION/> element is used in output bindings to specify success and failure conditions for the binding of data to be returned to calling programs. Conditions enable branching logic within service definitions; they are used to attempt alternate bindings when initial bindings fail and to initiate service chains, whereby the output variables from one service are passed as name-value pairs into the input bindings of a second service. Conditions also define error messages returned to calling programs when services fail.
Attribute Name | Description | Type | # | Default |
TYPE | Specifies whether a condition is checking for the 'Success' or the 'Failure' of a binding attempt, or whether a binding attempt should be retried in the case of a 'server busy' error. Any variable that returns a NULL value will cause the entire binding to fail, unless the NULLOK attribute of that variable has been set to true. Conditions can catch the success or failure of either a specific object reference or of an entire binding. In the case where a condition initiates a service chain, it is important that all variables bind properly. | String | 0 or 1 | "Success" |
REFERENCE | Specifies an object reference which extracts data from the HTML or XML document returned as the result of a service invocation. The Reference attribute for conditions is equivalent to the Reference attribute used in variable definitions. Identifies the document object property that will be compared with the text pattern specified by the Match attribute. | String | Exactly One | |
MATCH | Specifies a text pattern that will be compared with the object property referenced by the Reference attribute. | String | Exactly One | |
REBIND | Specifies an alternate output binding. Typically a failure condition indicates that the document returned cannot be bound properly. Rebind redirects the binding attempt. The use of REBIND allows a condition to determine the appropriate binding for extracting the desired data. Must refer to a binding that is defined in the same WIDL interface. | String | 0 or 1 | |
SERVICE | Specifies a service to invoke with the results of an output binding. Aside from the obvious benefit of chaining services to further automate the tasks that can be encapsulated for client programs, there are many cases when target documents can only be retrieved after visiting several Web pages in succession. In some instances cookies are issued by an entry page that must be visited prior to interacting with HTML forms; in others, URLs are dynamically generated from databases for specific user identities. | String | 0 or 1 | |
REASONREF | Reference to an object's attribute to be returned as an error message when a service fails. | String | 0 or 1 | |
REASONTEXT | The text to be returned as an error message when a service fails. | String | 0 or 1 | |
WAIT | Number of seconds to wait before retrying retrieval of a document after a server has returned a 'server busy' error. | String | 0 or 1 | |
RETRIES | Number of times to retry the service before failing. | String | 0 or 1 |
The <REGION/> element is used in output bindings to define targeted sub-regions of a document. Regions permit the extraction of data elements relative to other features of a document. Regions are critical for poorly designed documents where it is otherwise impossible to differentiate between desired data elements (for instance story links on a news page) and elements that also match the search criteria.
Attribute | Description | Type | # | Default |
NAME | Specifies the name for a region. This name can then be used as the root of an object reference. | String | Exactly One | |
START | An object reference that determines the beginning of a region. | String | Exactly One | |
END | An object reference that determines the end of a region. | String | Exactly One |
Web technology is strong on interactivity, but low on automation. Electronic commerce on the Web is primarily driven manually via a browser. In order to achieve business-to-business integration organizations have resorted to proprietary protocols. The many-to-many nature of Web commerce demands a standard for automated integration.
This proposal defines the infrastructure necessary for Web resources to be described as functional interfaces that can be invoked directly from business applications written in languages such as Java, C/C++, COBOL, and Visual Basic. By capturing details such as input parameters, service URLs, and data extraction methods for output parameters, WIDL enables automation of interactions normally performed manually via a browser.
<!ELEMENT WIDL ( SERVICE | BINDING )* >
<!ATTLIST WIDL
    NAME       CDATA #IMPLIED
    VERSION    (1.0 | 2.0 | ...) "2.0"
    TEMPLATE   CDATA #IMPLIED
    BASEURL    CDATA #IMPLIED
    OBJMODEL   (wmdom | ...) "wmdom"
>
<!ELEMENT SERVICE EMPTY>
<!ATTLIST SERVICE
    NAME       CDATA #REQUIRED
    URL        CDATA #REQUIRED
    METHOD     (Get | Post) "Get"
    INPUT      CDATA #IMPLIED
    OUTPUT     CDATA #IMPLIED
    AUTHUSER   CDATA #IMPLIED
    AUTHPASS   CDATA #IMPLIED
    TIMEOUT    CDATA #IMPLIED
    RETRIES    CDATA #IMPLIED
>
<!ELEMENT BINDING ( VARIABLE | CONDITION | REGION )* >
<!ATTLIST BINDING
    NAME       CDATA #REQUIRED
    TYPE       (Input | Output) "Output"
>
<!ELEMENT VARIABLE EMPTY>
<!ATTLIST VARIABLE
    NAME       CDATA #REQUIRED
    FORMNAME   CDATA #IMPLIED
    TYPE       (String | String[] | String[][]) "String"
    USAGE      (Default | Header | Internal) "Default"
    REFERENCE  CDATA #IMPLIED
    VALUE      CDATA #IMPLIED
    MASK       CDATA #IMPLIED
    NULLOK     (True | False) "False"
>
<!ELEMENT CONDITION EMPTY>
<!ATTLIST CONDITION
    TYPE       (Success | Failure | Retry) "Success"
    REFERENCE  CDATA #REQUIRED
    MATCH      CDATA #REQUIRED
    REBIND     CDATA #IMPLIED
    SERVICE    CDATA #IMPLIED
    REASONREF  CDATA #IMPLIED
    REASONTEXT CDATA #IMPLIED
    WAIT       CDATA #IMPLIED
    RETRIES    CDATA #IMPLIED
>
<!ELEMENT REGION EMPTY>
<!ATTLIST REGION
    NAME       CDATA #REQUIRED
    START      CDATA #REQUIRED
    END        CDATA #REQUIRED
>