This paper is provided for consultation purposes only and does not constitute a standard or a commitment to support or promote such by the World Wide Web Consortium.
The interaction of HTTP and HTML to support online sale of goods is considered. An understanding of the interaction between these specifications is essential to the security of the overall system.
This memo is principally concerned with the use of the World Wide Web (Web) to perform online sale of goods. The mechanisms described are also applicable for other uses including some which do not relate to payments.
Commercial use is one of the fastest growing areas of the Web. Such uses include advertising, sale of information and sale of goods. At present the most visible commercial use is advertising. Sale of information and goods requires transfer of monetary value which is poorly addressed in the current Web architecture.
A number of companies have deployed payments systems on the net, including First Virtual, CyberCash and MarketNet. In addition there are a number of proposals including Secure Courier, iKP Vishnu and e-Cash. A World Wide Web Consortium report, Electronic Payment Schemes, provides an overview of these schemes and others.
Although sale of information and goods both involve transfer of value there are significant differences. In a sale of goods the contract is not fulfilled until delivery of the physical goods occurs. In a sale of information the contract is satisfied by the network protocol itself. The potential losses of merchant and consumer also differ. Physical goods may only be sold once and the seller suffers an actual loss if payment is not made. The seller of information goods will in general suffer only an opportunity loss if payment is not made. Sale of information also differs from sale of goods in that the typical purchase price is lower and that the speed and usability of the user interface is a part of the product. Information goods also differ from physical goods in that it is not possible to return them if they are unsatisfactory. In the light of these differences the user interface requirements of the sale of information case are not addressed in this paper.
For the Web to support sale of goods secure mechanisms for making payments are essential. Use of a secure communication protocol in itself does not guarantee security however. Security is a property of complete systems, not simply of components. Where a protocol makes assumptions about the behavior of higher level user interface components it is important they are clearly understood.
The user interaction pattern for sale of goods has three stages: negotiation, agreement and payment. In the negotiation stage the terms of the contract are determined. In the agreement stage the contract is either rejected or accepted and thus becomes binding. The payment stage achieves the movement of value.
It is generally desirable for agreement and payment to be atomic so that the offer of payment constitutes acceptance of the contract. In some cases however payment may be dealt with separately. This is often the case with commercial orders where payment is not made until after delivery or is dealt with by a separate accounting department.
In some cases payment may occur without a negotiation or agreement phase. In these cases the payments mechanism is used purely for transfer of funds. Charitable and political donations fall into this category. In other cases payment may be of a semi-voluntary nature. For example an advertiser may pay a publisher according to the proportion of trade a link generates. Since the publisher can decide which links to advertise on the basis of the amount paid, it is in the interests of the advertiser to be honest. In such circumstances mechanisms to ensure binding of movement of value to a particular contract may be unnecessary.
In the negotiation phase the customer and merchant decide upon the goods to be purchased and the terms of the sales contract. In most cases the customer will specify the goods to be purchased and the merchant the contract terms. For there to be a contract both customer and merchant must agree to both parts of the agreement.
What constitutes an agreement may differ according to the relevant jurisdiction. Representations made prior to presentation of the offer document may in some circumstances form a part of the contract despite clauses to the contrary in the offer document. In addition consumer protection legislation in many countries may affect what terms and conditions are binding. Where sales take place across national borders the question of deciding the jurisdiction in which disputes concerning the contract may be settled may be a complex one.
The negotiation model may be informal, for example selection of goods from a catalogue or formal, for example a bidding process in which the goods are supplied to the highest bidder. The problem of formal bidding structures is an important one but is not addressed in this memo.
The need to protect customer confidentiality in the negotiation phase must be carefully considered. In addition to possible statutory data protection requirements companies must act to protect their reputation.
The negotiation may involve highly specialized requirements. It is thus inappropriate to attempt standardization in the near future although it is likely that agreement on a number of broad principles will emerge. The customer's needs may be more complex than may be expressed in a simple fill-in form. For example a customer ordering a camera and a lens might want to purchase the camera should the lens turn out not to be in stock but not want a lens without the camera. In mail and telephone ordering the expression of such requirements is easier since processing of orders involves a human element. Designers of Web interfaces should consider such needs.
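A requirement such as the camera and lens case can be expressed as a dependency between order lines. The following is a minimal sketch in Python under stated assumptions: the names OrderLine, requires and resolve_order are hypothetical and do not correspond to any deployed scheme.

```python
from dataclasses import dataclass, field


@dataclass
class OrderLine:
    """One item in an order, with the items it must be supplied together with."""
    item: str
    requires: set = field(default_factory=set)  # items this line depends on


def resolve_order(lines, in_stock):
    """Return the items that can be supplied while honouring every dependency.

    A line is dropped if it is out of stock or if any item it requires has
    been dropped; the loop repeats until no further lines are dropped.
    """
    accepted = {line.item for line in lines if line.item in in_stock}
    changed = True
    while changed:
        changed = False
        for line in lines:
            if line.item in accepted and not line.requires <= accepted:
                accepted.discard(line.item)
                changed = True
    return accepted


# The customer will take the camera alone, but never the lens alone.
order = [OrderLine("camera"), OrderLine("lens", requires={"camera"})]

camera_only = resolve_order(order, in_stock={"camera"})
lens_only = resolve_order(order, in_stock={"lens"})
both = resolve_order(order, in_stock={"camera", "lens"})
```

A simple fill-in form carries only the list of items; the dependency information is what a human order clerk would otherwise supply.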
In the agreement phase the merchant provides an offer of goods which the customer either accepts or rejects. If the offer involves exchange of a valuable consideration and the customer accepts, a contract has been formed. The following issues must therefore be addressed:
The Web does not specify a unique and unambiguous form for the presentation of a particular HTML document. The specification defines only the document structure not its presentation. This permits the rendering of a document in an arbitrary medium including non-visual mediums such as speech. The question of determining the terms of an offer is therefore more complex than a simple requirement for a non-repudiable digital signature.
In order to enhance reliability Web clients are required to be tolerant of incorrectly formatted documents. There is however no standard which defines how incorrect HTML should be presented. Different clients may present the same document differently; in some cases text which is present in one presentation may be absent in another. Such ambiguities provide at best the potential for costly legal argument and at worst opportunities for merchant fraud. Similar problems arise when non-standard extensions are employed.
In order to prevent unnecessary ambiguities it is recommended that all offer documents be required to adhere to a published standard. Thus if the content type text/html is specified documents should conform to the current HTML DTD.
In determining whether the contract was accepted the question arises of whether the user action should be interpreted as an intention to accept the contract. The Web permits user interfaces of substantial complexity and visual impact. A payments interface must be carefully designed to prevent the rich user interaction capabilities of many Web browsers from being exploited to mislead the customer. The payments mechanism must ensure that actions connected with payments are unambiguously signaled as such to the user.
Care must be taken that ambiguity is not introduced through the overloading of existing user interface idioms, for example the password authentication mechanism and the URL encoding.
A significant number of Web clients perform automated document retrieval. These include fully automated ``Web walkers'' such as Web Walker, and power-browsers. Retrieval of a document by such a client should not in itself be interpreted as an intention to accept a contract.
The question of whether a valuable consideration was exchanged lies outside the scope of this memo.
The payments stage is concerned with the actual transfer of funds. This stage may involve the provision of certain information which is outside the scope of the contract itself such as account details and authentication information.
In general it is desirable that the act of acceptance and initiation of payment be atomic. This may be considered an implicit term of the offer that the contract is not binding until an offer of payment is made. The legal status of orders where the offer of payment is refused or the payment is dishonored must be carefully considered.
The user interface must be secure, convenient and extensible. Security requirements include the need to protect authorization information from unauthorized access, the maintenance of customer confidentiality and tracking of payments in progress. The problem of protecting each party from fraud must also be addressed. The user interface should be efficient in both speed and use.
Users have very exacting expectations of any system which involves money. It is essential that such systems provide the user with confidence in their design and implementation. Otherwise a system is likely to encounter overwhelming consumer resistance and fail to gain acceptance.
Users must also be confident that they can install, configure and use the interface without making unintended purchases. This is especially important on the Internet where users have traditionally tolerated less than perfect security and reliability. A clear and complete description of the payments mechanism may be useful in building user confidence. Complex installation procedures should be avoided.
Third party financial institutions may play a critical part in the building of customer confidence, in particular through the ability of a third party to resolve disputes. Availability of information to allow resolution of disputes is therefore important.
The payments interface should attempt where practical to protect the user from unintended purchases. Such purchases might be caused by user error, by mis-keying or by intentional deception on the part of the merchant. The latter problem is considered in the section on fraud below.
The elimination of user error is a difficult problem which has not found a solution. In general the best which may be achieved is to minimize the number of severe errors by signaling potentially dangerous operations distinctively. Warning boxes and other signals may be effective to this end provided that they are not so over used that affirmation by the user becomes automatic.
The user interface must be carefully implemented to prevent unintended payments due to software failure. For example graphical user interface software must ensure that user interaction intended for one window is not misdirected to another. This problem is seen in many commercial applications where a double click on one page may cause activation of an image area on the succeeding page.
The user interface should be efficient in terms of both speed and user interaction. Unnecessary user interactions should be avoided. Unnecessary communications introduce both delay and additional potential failures. Where substantial processing is required the purpose of that processing should be displayed. Wherever possible processing should not halt the user interface.
In order for a payments application as a whole to be secure authorization information such as credit card numbers for which protection is required in transit must be equally well protected by the application itself. Authorization information must only be solicited through a user interface which is clearly distinguished from other network related interaction.
On systems without secure memory management care must be taken that sensitive information is overwritten before memory is released for reuse. This consideration is particularly important in environments where plug in modules with access to a single data space may have different degrees of trust. Personal computers with virtual memory or automatic startup features pose particular problems. In many cases the operating system makes no provision for protection of memory stored to disk.
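The discipline of overwriting sensitive data before release can be illustrated as follows. This is a sketch only, with the hypothetical helper names wipe and with_secret. Python is used for brevity and gives no guarantee that the runtime has not made other copies of the data, nor does the sketch address pages already written to disk by the operating system.

```python
def wipe(buffer: bytearray) -> None:
    """Overwrite every byte of a mutable buffer in place."""
    for i in range(len(buffer)):
        buffer[i] = 0


def with_secret(secret: bytearray, use):
    """Pass the secret to `use`, then wipe it even if `use` raises."""
    try:
        return use(secret)
    finally:
        wipe(secret)


# A made-up account number held in a mutable buffer rather than a string.
card_number = bytearray(b"4000000000000002")
last_four = with_secret(card_number, lambda s: bytes(s[-4:]).decode("ascii"))
```

After the call the buffer still exists but holds only zero bytes, so a module which later obtains the same memory finds nothing of value in it.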
Customer confidentiality must be maintained. In many jurisdictions the use of personal data is strictly controlled by statute. In addition the tightly coupled communication structure of the Internet raises the level of ethical standards expected of companies using it for commerce.
The user must be aware of all information which is to be communicated to the merchant prior to acceptance of the contract becoming effective.
In some cases a transaction may not be completed without the user being informed as to whether the payment failed or succeeded. In such cases a mechanism for discovering the status of a partially completed transaction is essential.
Incomplete transactions should be recorded on a non-volatile medium during processing in case of hardware or software failure. For merchant and financial institution server software this presents a well understood and supported technical requirement easily solved through careful hardware and software choice.
This task is considerably more difficult in the consumer market where there is almost no control over hardware choice and little over software. Although use of a payments application module may be stipulated it is not possible to mandate the use of a particular operating system. In some cases it may not be possible to determine whether a file is located on a physical disk or in memory. Provision for separate configuration of volatile and non-volatile storage areas may assist in this regard.
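Recording of incomplete transactions may be sketched as an append-only journal which is forced to storage before each step proceeds. The file name, record layout and function names below are assumptions made for illustration. As noted above, a request to flush data cannot guarantee that the file resides on a physical disk.

```python
import json
import os
import tempfile


def journal_append(path, record):
    """Append one record and ask the operating system to commit it."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def pending_transactions(path):
    """Return ids of transactions that were started but never completed."""
    state = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            state[record["id"]] = record["status"]
    return sorted(tid for tid, status in state.items() if status == "started")


journal_dir = tempfile.mkdtemp()
journal = os.path.join(journal_dir, "payments.journal")

journal_append(journal, {"id": "t1", "status": "started"})
journal_append(journal, {"id": "t1", "status": "completed"})
journal_append(journal, {"id": "t2", "status": "started"})  # failure occurs here

unresolved = pending_transactions(journal)
```

On restart the application reads the journal and can enquire after the status of each transaction listed in unresolved, here the single transaction t2.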
Although cryptographic payments protocols generally concentrate on the problem of customer fraud all the parties in a transaction must be considered as a potential source of fraud. Banks and merchants cannot guarantee absolutely the honesty of their employees. The merchants themselves and even banks may prove dishonest.
In one type of customer fraud a credit card number and corresponding billing address are obtained and verified. A telephone order is then placed to be delivered to the billing address. The delivery of goods to the billing address is often used as a security precaution. The shipping number of the goods is obtained and used to request redirection of the goods to an alternative address. An article written under the pseudonym VaxBuster describes a fraud of this type.
The potential loss due to misuse of an individual customer account is limited by the balance or credit limit and is therefore relatively small. Merchant fraud may involve many customer accounts and hence considerably greater sums.
In one form of merchant fraud a business trades legitimately for a period of time. Once a credit rating has been established and a line of credit established with various institutions a large quantity of goods are billed. Once payment is made the merchant disappears before it is discovered that the goods sold did not exist.
The payments protocol itself can do little to prevent this type of fraud since to all outward appearances the payments appear entirely legitimate to both customer and acquirer until it is discovered that the goods will not arrive. The sooner a customer is able to discover that billing has occurred without dispatch of the goods the smaller the time window for a successful fraud becomes. Provision of up to date statements of account may serve to narrow this window. In addition merchants might be required to provide the customer with some proof of dispatch.
In another form of merchant fraud the wrong goods are delivered intentionally. This may be a simple misrepresentation of the goods or a more subtle deceit such as the quotation of a low price for parts of a composite order while only intending to supply the high profit margin parts of the order. The provision of a non-repudiable and full description of the goods is therefore essential.
A number of proposed payments protocols incorporate a signature of the offer document so that it cannot be repudiated. Before such schemes may be implemented the precise form of the offer document must be defined.
It is also necessary to consider which party has responsibility for storing the text of the document itself. It is in both parties' interests to store at least the relevant digital signatures. It is only strictly necessary for the document to be retained by one party but it is likely that both will require access. Both merchant and customer implementations must therefore have provision for archiving this information.
An electronic contract differs from a paper contract in that a paper document has only a single presentation embodied in its physical form whereas a digital document may have multiple presentations. A potential ambiguity arises as to what has been agreed since a digital signature relates only to the representation of the document in question, and not the presentation. This ambiguity may be reduced by requiring the offer document to adhere to a well defined standard.
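The distinction between representation and presentation may be illustrated with a message digest, which stands in here for the value that a signature scheme would sign. The choice of SHA-256 and the two sample documents are assumptions of this sketch. The two byte sequences differ only in white space, which an HTML client normally collapses in running text, so a typical client presents them identically.

```python
import hashlib

# Two representations that a typical client presents in the same way.
offer_a = b"<p>One camera, price 100.34 CHF</p>"
offer_b = b"<p>One camera,   price 100.34 CHF</p>\n"

digest_a = hashlib.sha256(offer_a).hexdigest()
digest_b = hashlib.sha256(offer_b).hexdigest()

# The digests, and hence any signatures over them, do not match.
same_representation = digest_a == digest_b
```

The converse case is the more troublesome one for a contract: a single signed representation which two clients present differently.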
One approach to this problem is the use of a very narrow document format with a single defined presentation. Such a format would have to address a number of complex requirements such as internationalization. It is unlikely that this approach could adequately address accessibility for special needs communities such as speech synthesis for blind users. In some cases tabular listing of purchased items or even a picture of the goods may be highly desirable. The use of a document format with a broad scope such as HTML is therefore desirable.
Dependable presentation of the offer document requires that it conform to a well defined standard. Clients differ widely in their presentation of invalid documents. This behavior frequently varies between different releases of the same product. Invalid documents are also more likely to expose errors in the client software.
A number of extensions to the HTML standard are currently proposed. These include the ability to determine the size, color and font of text and to provide background images. Although these mechanisms allow the creation of documents with high visual impact they also provide potential for abuse. Color controls in particular are a cause for concern. Their use may create text which is unreadable such as black ink on a black background. Such effects may arise unintentionally, through unintended interaction with user preferences or hardware incompatibilities such as use of a monochrome display. It is therefore recommended that offer documents be required to conform to the HTML/2.0 DTD.
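A client or merchant could screen offer documents for elements outside the permitted set before presenting or signing them. The sketch below is not a DTD validator: it checks element names only, not attributes or nesting, and the element list is recalled from the HTML 2.0 specification (RFC 1866) and should be verified against the DTD itself. The names OfferChecker and non_conforming_elements are hypothetical.

```python
from html.parser import HTMLParser

# Element names recalled from HTML 2.0 (RFC 1866); an assumption of this
# sketch, to be checked against the DTD before any real use.
HTML2_ELEMENTS = {
    "a", "address", "b", "base", "blockquote", "body", "br", "cite", "code",
    "dd", "dir", "dl", "dt", "em", "form", "h1", "h2", "h3", "h4", "h5", "h6",
    "head", "hr", "html", "i", "img", "input", "isindex", "kbd", "li", "link",
    "listing", "menu", "meta", "nextid", "ol", "option", "p", "plaintext",
    "pre", "samp", "select", "strong", "textarea", "title", "tt", "ul", "var",
    "xmp",
}


class OfferChecker(HTMLParser):
    """Collect element names that fall outside the permitted set."""

    def __init__(self):
        super().__init__()
        self.unknown = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser supplies tag names in lower case.
        if tag not in HTML2_ELEMENTS:
            self.unknown.append(tag)


def non_conforming_elements(document):
    """Return the names of elements in `document` not in HTML2_ELEMENTS."""
    checker = OfferChecker()
    checker.feed(document)
    checker.close()
    return checker.unknown


plain = "<html><body><h1>Offer</h1><p>One camera, 100.34 CHF</p></body></html>"
styled = "<html><body><font color=black>One camera</font></body></html>"

plain_problems = non_conforming_elements(plain)
styled_problems = non_conforming_elements(styled)
```

A document which uses a font or color extension is reported, and could be refused as an offer document rather than presented in a possibly misleading form.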
Where images are embedded within text to produce a composite document it is the composite document which constitutes the offer and hence it is the composite document which must be signed. A proposal for creating such signatures is in development. It is recommended that embedded images not be used until such a standard is agreed.
One of the main reasons for the success of the Web is its ability to interoperate across a wide variety of platforms, media types and networks. To maintain this ability extensions must provide functionality which is applicable to a wide variety of needs rather than provide a specific solution to a single problem. Otherwise the diversity of needs which the Web must address would continually expand the complexity of the specifications, increasing the complexity and cost of maintenance of implementations.
Online payment services are a fast developing area. In addition to currently deployed schemes based on credit cards, there are many cash and cheque based schemes in advanced development or trial phases. Changes in implementation technology should also be anticipated, for example the use of smartcards for authentication. In addition there will be an ongoing need to ensure that software implementations are up to date.
A modular design in which software concerned with the payments protocol may be updated independently of the browser is therefore highly desirable. Depending on the platform concerned it may be appropriate for a payments module to provide its own user interface or for this to be provided separately.
Where a payment is mediated through a third party this party may have a liability with respect to the correct functioning of the user interface. Such a party may find it essential to fully control the presentation of the offer document.
The Web architecture must specify the means by which a client is informed when a payments mechanism should be invoked and what data to provide. This problem arises regardless of whether the payments mechanism is implemented as an internal or external module.
Where an interface is provided which permits a function to be supported as an external module it is possible to incorporate the same function as an internal module. The converse does not hold however. The architecture therefore concentrates on the case where a module is external. This does not preclude internal support for the function. The requirement for extensibility should be addressed however. It is highly desirable to support the external interface even where there is internal support.
If the offer document is to be comprehensive the use of HTML markup is preferred. Since presentation of HTML markup makes up a significant proportion of the implementation effort of a browser client, it would be preferable to leave this task to the browser to avoid the payments module becoming one. Unfortunately this approach may not address the question of liability in the case of software failure.
The markup of the offer document must provide a means of invoking the payments module and communicating to it the price, currency and other relevant data. It is also desirable that the payments module be able to take advantage of the client's network interface. This avoids the need for the user to configure the network interface of the payments module separately.
The security of the payments process is the primary responsibility of the payments module. It is therefore desirable that this module take responsibility for ensuring the confidentiality of authentication information.
This architecture requires both browser and payments module to be trustworthy. A compromise of the client module constitutes a significant security risk in any case however.
One means of incorporating a new mechanism into the Web is to specify a new URL. This would imply a new transport protocol rather than a new data format. There are currently more than 50 proposals related to online payments. Assigning a URL to each proposal would increase the number of specifications fivefold. Protocols interfaced in this manner could not take advantage of the evolution of HTTP. This would also require implementations to provide their own network code.
The content-type interface is widely supported by browser clients. It is used by most currently deployed payments mechanisms to activate a helper application which provides the user interface and implements the payment protocol. This mechanism has the advantage of modularity: payments schemes do not require client support and software updates may be made without the need to replace the browser. It is also widely supported by the existing client software base. This mechanism was not designed to support payments however and it has several defects.
The key defect in the content-type interface is that communication between the client and the helper application is made via a remote server. The user activates a link which causes a message to be sent to the remote server; the server then returns a message whose content type causes the client to activate the helper application.
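The round trip just described may be sketched as follows. The content type application/x-pay-example, the helper name pay-helper and the link /buy/camera are all hypothetical, and the remote server is replaced by a local function so that the example is self-contained.

```python
# Hypothetical table of helper applications registered by the user.
HELPERS = {
    "application/x-pay-example": "pay-helper",
}


def server_response(link):
    """Stand-in for the remote server, which alone decides the content type."""
    if link == "/buy/camera":
        return {"content-type": "application/x-pay-example",
                "body": b"amount=100.34"}
    return {"content-type": "text/html", "body": b"<p>catalogue</p>"}


def client_follow(link):
    """Return (helper, data) if a helper is activated, else (None, body).

    The helper is given only the response body; nothing about the document
    which contained the link is passed to it.
    """
    response = server_response(link)
    helper = HELPERS.get(response["content-type"])
    return helper, response["body"]


purchase = client_follow("/buy/camera")
browse = client_follow("/")
```

The helper cannot be activated without a response from the server, and the data it receives carries no reference to the page the user was reading.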
The indirect communication path introduces unnecessary complexity and prevents offline use. The client can only be instructed to activate the payments module while connected to the network. In particular the client cannot at present communicate any information about the source document to the helper application. This prevents the information presented by the client from forming part of a non-repudiable contract. This complicates the user interface from the point of view of both the user and implementor. The contract must effectively be presented to the user twice, first by the client, then by the payments application. The question of which document constitutes the offer may therefore arise.
The indirect communication path is also in part responsible for the loss of information concerning the context in which the module was invoked. There is no direct connection between the text of the offer document to which the customer may believe agreement is made and the data sent to the payments module. If the payments module is to provide a non-repudiable signature of the offer document it must present it itself.
Implementations of the content type mechanism generally provide no more than a uni-directional communication. The client activates the helper application and passes it data. There is no provision for the helper application to return information to the browser. In part the NCSA CCI proposal attempts to address this need; this is a control rather than a communication interface however. It is not appropriate for the client to control the payments module or vice versa.
Installation of helper applications is also unsatisfactory. Each helper application must be registered as a viewer for a particular content type. Some users may find this a complex task. A bi-directional interface would allow the client to query the payments module to determine its capabilities.
Additional information could be passed from the client to the helper application through a mechanism such as UNIX environment variables. This has the disadvantage of being highly platform specific and still fails to provide a path for return of information from the helper application to the client.
The requirement that an offer document conform to an accepted standard means that a solution dependent on extensions to HTML is unsatisfactory. The options for providing an alternative to the content-type helper application interface are therefore limited. The <A> anchor tag provides no attribute suitable for this purpose; the ENCTYPE attribute of the <FORM> tag provides a suitable base however. A payments mechanism may be regarded as a specialized encoding for forms data. In addition the ENCTYPE interface would be very widely applicable, permitting client side forms validation for example. Such a mechanism would require extension of currently implemented clients however.
The encoding type helper application interface uses the ENCTYPE attribute of an HTML <FORM> element to activate the helper application. The ENCTYPE attribute specifies the encoding of the data for transmission. This fits well with the concept of a payments scheme for online sales which is effectively a highly specialized encoding scheme for a particular class of form. A client may opt to implement an encoding format either internally or externally in the same manner that content types are generally handled.
Communication between the client and helper application requires a bidirectional transport similar to HTTP/1.1 (i.e. HTTP with persistent connections) or HTTP-NG. In most cases however it would be more appropriate to use a process to process communication mechanism rather than TCP/IP. The appropriate mechanism will depend on the platform, for example streams under UNIX, mailboxes on VMS, OLE for Windows, AppleScript events etc.
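A bidirectional exchange of this kind may be sketched with a connected pair of local sockets. The one-line CAPABILITIES query and its reply are an invented wire format, not part of any proposal, and a thread stands in for the separate helper process. A real protocol would also need message framing, which is omitted here.

```python
import socket
import threading


def helper_loop(conn):
    """Hypothetical payments helper: answer one capability query, then exit."""
    with conn:
        request = conn.recv(1024)
        if request == b"CAPABILITIES\n":
            conn.sendall(b"application/pay-money\n")
        else:
            conn.sendall(b"ERROR\n")


def query_capabilities():
    """Client side: ask the helper which encoding types it implements."""
    client_end, helper_end = socket.socketpair()
    worker = threading.Thread(target=helper_loop, args=(helper_end,))
    worker.start()
    with client_end:
        client_end.sendall(b"CAPABILITIES\n")
        reply = client_end.recv(1024)
    worker.join()
    return reply.decode("ascii").strip()


capabilities = query_capabilities()
```

Because the helper can reply, the client is able to discover what the helper supports instead of relying on the user to register it correctly.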
Such an extension interface would also be more widely applicable, for example to allow form validation modules to be interfaced to the client.
The following piece of HTML declares a form to be submitted using the application/pay-money encoding type. This is a hypothetical payments scheme which involves the following information being communicated to the client:-
Parameters to be communicated to the payments helper application are encoded in an HTML form. The labels for the form items are defined by the payment protocol. Sensitive information (in this case the account number) is not included in the form since it is not distinguished as secure.
<form action="http://payco.com/order/1452734" method=pay enctype=application/pay-money>
<input type=hidden name=amount value="100.34">
<input type=hidden name=currency value="CHF">
<textarea name=memo></textarea>
<input type=submit>
</form>
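A client supporting this interface would read the ENCTYPE attribute and the form fields and hand them to the module registered for that encoding type. The following sketch shows only the extraction step, using the Python standard library parser. The names FormReader and payment_request are hypothetical, and the form text repeats the example above with the textarea element closed.

```python
from html.parser import HTMLParser

OFFER_FORM = """
<form action="http://payco.com/order/1452734" method=pay enctype=application/pay-money>
<input type=hidden name=amount value="100.34">
<input type=hidden name=currency value="CHF">
<textarea name=memo></textarea>
<input type=submit>
</form>
"""


class FormReader(HTMLParser):
    """Record the form's attributes and the values of its hidden fields."""

    def __init__(self):
        super().__init__()
        self.form = {}
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.form = attrs
        elif tag == "input" and attrs.get("type") == "hidden":
            self.fields[attrs["name"]] = attrs["value"]


def payment_request(document):
    """Return (enctype, action, fields) for the form in `document`."""
    reader = FormReader()
    reader.feed(document)
    reader.close()
    return reader.form.get("enctype"), reader.form.get("action"), reader.fields


enctype, action, fields = payment_request(OFFER_FORM)
```

The encoding type selects the payments module, while the amount and currency reach that module directly from the offer document and not by way of a second message from the server.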
The sequence of
Since the ENCTYPE interface represents an extension to the client capability it is necessary for the server to be aware that the browser supports it. This issue is considered in detail in the World Wide Web Consortium technical memo ``HTTP/1.2 Modular Extension Mechanism''.
The author is grateful for the assistance of the W3C team in preparing this report and in addition for numerous helpful comments by Nathaniel Borenstein of First Virtual on an early draft.