Transactional Conversations

Svend Frolund and Kannan Govindarajan

Hewlett-Packard Company

email: svend_frolund@hp.com, kannan_govindarajan@hp.com

Introduction

We believe that web services should make their transactional properties available to other web services in an explicit and standard way. Transactional properties should be part of a service's interface rather than a hidden aspect of its backend. The transactional behavior of a service can then be exploited by other services to simplify their error-handling logic and to make entire business-to-business interactions transactional. However, such business-to-business transactions are challenging to implement because they span multiple companies and because the underlying transaction protocols execute over wide-area networks.

It is fundamental for web services to communicate through conversations. A conversation is a potentially long-running sequence of interactions (document exchanges) between multiple web services. For example, a manufacturer may engage in a conversation with a supplier and a conversation with a shipper to carry out the activity of purchasing parts. In many situations, the backend logic triggered as part of these conversations may be transactional. For example, it may be possible to arrange for parts to be shipped, and then later cancel the shipment (if the parts have not actually been sent yet). Cancelling the shipment is an example of a compensating transaction: it compensates for the initial transaction that arranged the shipment. Since the notion of conversation is fundamental to web services, the exportation of transactional properties should fit with conversations, giving rise to transactional conversations.

In the Internet setting, atomicity is the most important aspect of transactions. Atomicity means that the effect of a transaction either happens entirely or not at all. We also refer to this as all-or-nothing semantics. If a service A knows that a conversation with another service B is atomic, then A can cancel the conversation and know that B will cleanly revert to a consistent state. Furthermore, A can rely on B's transactional behavior to ensure state consistency in the presence of failures, such as logical error conditions (e.g., shipping is impossible) or system-level failures (e.g., the crash of a process or machine).

Services should expose their transactional behavior in a manner that facilitates composition of transactions from different services. For example, it should be possible for the manufacturer to compose a transactional conversation with the supplier and a transactional conversation with a shipper into a transactional activity that includes both conversations. The advantage of creating these multi-conversation transactions is that the manufacturer gets all-or-nothing semantics for the entire end-to-end purchasing activity: if shipping is not available, the order placement is cancelled. This is a very powerful notion that we believe will significantly reduce the complexity of programming business-to-business activities between multiple web services. Composite transactions provide a notion of "clean abort" for entire business-to-business activities. Moreover, having a standard notion of transaction allows us to build generic software components that perform the transaction composition.

Atomicity

We discuss different ways for transactions to be atomic. As terminology, we introduce the notion of a transaction outcome, which is either commit or abort. The outcome is abort if the effect of the transaction is "nothing." The outcome is commit if the effect is "all."

Two-Phase Commit and Compensation

If we execute two atomic transactions, their combined effect is not necessarily atomic: one transaction may abort and the other may commit, which means that the combined effect is neither all nor nothing. If we create a composite transaction from two constituent transactions, we need to ensure that either both constituent transactions commit or that both constituent transactions abort. There are two traditional ways to ensure this. One way, called two-phase commit, is based on the idea that no constituent transaction is allowed to commit unless they are all able to commit. Another way, called compensation, is based on the idea that a constituent transaction is always allowed to commit, but its effect can be cancelled after it has committed.

With two-phase commit, transactions provide a control interface that allows a transaction coordinator to ensure atomicity. One incarnation of the control interface is the XA specification [3]. Essentially, the control interface provides a prepare method, an abort method, and a commit method. The coordinator calls the prepare method to determine if a transaction is able to commit. If the transaction answers "yes," then the transaction must be able to commit even if failures occur. That is, the transaction is not allowed to later change its mind. If all transactions answer "yes," the coordinator calls their commit method; otherwise, the coordinator calls their abort method.
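The coordinator logic described above can be sketched in a few lines. This is a minimal, in-memory illustration only: the names Participant, InMemoryParticipant, and two_phase_commit are hypothetical and do not come from the XA specification, and the sketch leaves out the logging and failure recovery that a real coordinator needs.

```python
from typing import Protocol, Sequence


class Participant(Protocol):
    """Hypothetical control interface: prepare, commit, abort."""

    def prepare(self) -> bool: ...
    def commit(self) -> None: ...
    def abort(self) -> None: ...


def two_phase_commit(participants: Sequence[Participant]) -> str:
    """Return "commit" if every participant votes yes, otherwise "abort"."""
    # Phase 1: ask every participant whether it is able to commit.
    votes = [p.prepare() for p in participants]
    # Phase 2: deliver the same outcome to all participants.
    if all(votes):
        for p in participants:
            p.commit()
        return "commit"
    for p in participants:
        p.abort()
    return "abort"


class InMemoryParticipant:
    """Toy participant whose vote is fixed when it is created."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self) -> bool:
        if self.can_commit:
            # After answering "yes", the participant may not change its mind.
            self.state = "prepared"
        return self.can_commit

    def commit(self) -> None:
        self.state = "committed"

    def abort(self) -> None:
        self.state = "aborted"
```

For instance, if a supplier participant votes "yes" and a shipper participant votes "no", two_phase_commit calls abort on both and neither ends up committed.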

With compensation, there is no additional control interface. Instead, each "real" transaction has an associated compensating transaction. With compensation, a coordinator can ensure atomicity for a number of constituent transactions by executing the transactions, and if any transaction aborts, the coordinator executes the compensating transaction for each transaction that has committed.
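A compensation-based coordinator might look like the following sketch. It assumes that the constituent transactions run one after the other and that compensation happens in reverse order of commit; both are choices made for the illustration, not requirements stated here. The names Step and run_with_compensation, and the order and shipping functions, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Step:
    """A "real" transaction paired with its compensating transaction."""

    name: str
    run: Callable[[], bool]         # returns True on commit, False on abort
    compensate: Callable[[], None]  # cancels the effect of a committed run


def run_with_compensation(steps: Sequence[Step]) -> str:
    """Run the steps in order; on the first abort, compensate what committed."""
    committed: List[Step] = []
    for step in steps:
        if step.run():
            committed.append(step)
            continue
        # One constituent aborted: compensate the committed ones,
        # most recent first (an assumption of this sketch).
        for done in reversed(committed):
            done.compensate()
        return "abort"
    return "commit"


# Usage: the order commits, shipping aborts, so the order is compensated.
log: List[str] = []


def place_order() -> bool:
    log.append("order placed")
    return True


def cancel_order() -> None:
    log.append("order cancelled")


def arrange_shipping() -> bool:
    return False  # shipping is not available


def cancel_shipping() -> None:
    log.append("shipping cancelled")


outcome = run_with_compensation([
    Step("order", place_order, cancel_order),
    Step("shipping", arrange_shipping, cancel_shipping),
])
```

Note that the coordinator only does extra work on the abort path; when every step commits, it never touches the compensating transactions.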

Discussion

Although both two-phase commit and compensation can provide atomicity for composite transactions, there are trade-offs between the two methods. Compensation is optimistic in the sense that the coordinator only enters the picture if something bad (e.g., abort) happens. With two-phase commit, the coordinator enters the picture even if all transactions commit. The coordinator always calls prepare and either commit or abort for any transaction. On the other hand, two-phase commit always provides a point after which a service can forget about a transaction. Once the transaction is instructed to commit, the service can forget about the transaction. With compensation the service has to be able to compensate forever. The ability to compensate may or may not require the service to maintain persistent state. Of course, there are hybrid models where compensation is bounded by time or the occurrence of events (such as receiving a notification).
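As an illustration of such a hybrid model, the sketch below bounds compensation both by a time window and by a closing event. The class BoundedCompensation and its fields are hypothetical, and times are plain numbers in seconds supplied by the caller rather than read from a real clock.

```python
from dataclasses import dataclass


@dataclass
class BoundedCompensation:
    """Tracks whether a committed transaction may still be compensated."""

    committed_at: float    # time of commit, in seconds
    window_seconds: float  # how long the service agrees to compensate
    closed: bool = False   # set once a closing event (e.g. a notification) arrives

    def close(self) -> None:
        # After this point the service can forget about the transaction.
        self.closed = True

    def can_compensate(self, now: float) -> bool:
        within_window = now - self.committed_at <= self.window_seconds
        return within_window and not self.closed
```

With a one-hour window, an order committed at time 0 can be compensated at time 1800 but not at time 7200, and not at all once close() has been called.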

In practice, few systems use two-phase commit in the Internet context. One reason is that, with two-phase commit, a service exposes transaction control to other services. If a service answers "yes" in response to a prepare request, the service has to be able to commit the transaction until instructed otherwise by the coordinator (which may be another service). Few services are willing to export such transaction control in a loosely-coupled system. Another reason for the limited use of two-phase commit is that composite transactions may be long running. If we want transactions to span entire business-to-business activities, we have to accept the possibility that transactions may run for a long time. With two-phase commit, a constituent transaction cannot commit until the composite transaction can commit. Thus, a fast service may be forced to wait for a slow service.

If we want to support two-phase commit, we need a protocol that allows flexible designation of the coordinator role. For example, a given web service may be willing to play the role of participant in certain situations, but may insist on playing the role of coordinator in other situations. If we have a conversation definition language, such as CDL [1], we can capture this distinction through different conversations. A service can export a coordinator version and a participant version of the same logical conversation.

We believe compensation is a fundamental notion of atomicity for web services, and in the remainder of this paper, we shall only consider compensation. This does not reflect a position against two-phase commit, but is merely to simplify the discussion.

Isolation, Durability, and Consistency

Traditional database transactions satisfy the ACID properties (atomicity, consistency, isolation, and durability) [2]. Unlike traditional database transactions, we do not believe that transactions, in the context of conversations, should necessarily provide isolation. Isolation is concerned with controlling the concurrent execution of transactions to provide the illusion that concurrent transactions execute one after the other in some (indeterminate) serial order. Isolation is unnecessarily strict for Internet transactions. This is evident from many Internet sites that provide transactions without isolation. For example, sites, such as Amazon.com, provide transactional semantics in the form of compensation (cancelling an order within a given time limit) and do not provide isolation. Besides being unnecessarily strict in many cases, isolation is also costly because transactions may be long running, and providing isolation for long-running transactions hampers the overall performance.

To continue the comparison with database transactions, we would expect the "primitive," constituent transactions to provide durability and consistency. Durability means that transactional updates are made persistent if the transaction completes successfully. Consistency means that a transaction takes the system from one consistent state to another. The durability and consistency of constituent transactions follow from the transactional properties of the backend logic in web services. We do not believe that "composite," multi-conversation transactions should provide any global notions of durability or consistency beyond what the constituent primitive transactions provide. In other words, we do not rely on any particular notion of durability or consistency when we compose primitive transactions into composite transactions.

Describing Transactional Properties

We outline briefly what it may take to describe, and thus export, transactional properties of web services. The starting point for our discussion is the assumption that services communicate through explicit conversations. If a service exports a description of its conversations--the conversations it is willing to engage in--the question is how the service can specify the transactional properties of those conversations. The specification makes explicit to other services how the service is transactional. The specification should communicate the following aspects of transactions:

Demarcation. We need to describe which parts of a conversation are transactional. If we consider a conversation as a sequence of interactions, we need to identify the transactional sub-sequences of those interactions. At one extreme, the entire conversation may be transactional. But other possibilities may exist as well. For example, only a single interaction may be transactional, or an identified sub-sequence may be transactional. In general, a single conversation may have multiple transactional parts.
Outcome. To exploit the transactional behavior of a service, we need to know the outcome (commit or abort) of its transactions. One way to communicate the outcome of a transactional conversation to other services is to associate a particular outcome with a particular point in the conversation. For example, a specific interaction may denote abort and another interaction may denote commit. If the conversation reaches an interaction that indicates abort, the parties of the conversation know that the outcome is abort. We need to describe which interactions indicate abort and which interactions indicate commit. Notice that we can also use document types instead of interactions to indicate the outcome of transactions.
Compensation. We need to describe how transactions can be cancelled or compensated for. For example, a conversation may have a particular document that triggers compensation, or different documents may trigger compensation at different points in the conversation. To initiate compensation at a given point in a conversation, sending a compensation document must be a legal interaction at that point in the conversation, and we must be able to generate the appropriate compensation document. Notice that compensation may not be possible at every point during a transactional conversation, so we need to know both how and when compensation is possible.

If a service exports a description of its conversations in the form of an XML document, we can think of the description of the transactional properties as a companion document.
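To make the three aspects concrete, the sketch below models the content of such a companion description as a plain data structure rather than as XML. Everything in it is hypothetical: the class TransactionalPart, its field names, and the interaction and document names (RequestShipment, ConfirmShipment, RejectShipment, CancelShipment) are invented for illustration and are not taken from CDL.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Optional, Tuple


@dataclass
class TransactionalPart:
    """Transactional properties of one sub-sequence of a conversation."""

    # Demarcation: the interactions that make up the transactional part.
    interactions: Tuple[str, ...]
    # Outcome: which interactions indicate commit and which indicate abort.
    commit_interactions: FrozenSet[str]
    abort_interactions: FrozenSet[str]
    # Compensation: for each interaction after which compensation is legal,
    # the document type that triggers it.
    compensation_documents: Dict[str, str] = field(default_factory=dict)

    def outcome_of(self, interaction: str) -> Optional[str]:
        """Return "commit", "abort", or None if no outcome is implied."""
        if interaction in self.commit_interactions:
            return "commit"
        if interaction in self.abort_interactions:
            return "abort"
        return None

    def compensation_document(self, interaction: str) -> Optional[str]:
        """Return the compensating document type, or None if not possible here."""
        return self.compensation_documents.get(interaction)


# Hypothetical description of a shipping conversation.
shipping = TransactionalPart(
    interactions=("RequestShipment", "ConfirmShipment", "RejectShipment"),
    commit_interactions=frozenset({"ConfirmShipment"}),
    abort_interactions=frozenset({"RejectShipment"}),
    compensation_documents={"ConfirmShipment": "CancelShipment"},
)
```

A client that has just seen ConfirmShipment can look up both that the outcome is commit and that sending a CancelShipment document is the way to compensate; after RequestShipment alone, neither an outcome nor a compensating document is defined.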

Requirements

To conclude, we outline basic requirements for web service transactions.

We want our notion of web service transaction to fit with conversations. Conversations provide the context for transactions: transactions take place within conversations, and we talk about transactional conversations. The integration of conversations and transactions has consequences for the transaction model. Because conversations can be long-running, so can transactions. The transaction protocols, such as two-phase commit and compensation, involve communication between web services. These communications should be first-class members of the conversations between web services. For example, if we have a conversation definition language to describe conversations, we should use that language to describe the transaction protocols as well.

We want to support compensation as part of the transaction model. With two-phase commit, transactional web services rely on an external entity, a transaction coordinator, to communicate the transaction outcome to them. Such reliance on external entities may not always be appropriate in loosely-coupled systems. Compensation does not introduce the same level of reliance on external entities. Our position is not against two-phase commit, but rather in favor of compensation: two-phase commit protocols may be appropriate in certain situations. If we have a transaction model that supports both two-phase commit and compensation, we have to address the issue of "mixed-mode" transactions--transactions whose constituent transactions are based partly on two-phase commit and partly on compensation.

In general, regardless of the choice of transaction model, we want to support a decentralized, peer-to-peer model for transactions. For example, we do not want to assume the existence of a centralized transaction coordinator. We do not want to prevent a centralized notion of coordinator; we simply do not want to rely on one. Notice that the notion of a transaction coordinator may be relevant for both two-phase commit and compensation. A central coordinator might make sense in conjunction with compensation. This coordinator would then gather the outcomes of the various constituent transactions and execute compensating transactions as necessary.

We need to address the issue of trust between the web services that participate in a transaction. Both two-phase commit and compensation assume that the various parties are well-behaved (or trusted). For example, two-phase commit assumes that participants vote "honestly" and that they do as instructed (commit or abort). Furthermore, the notion of compensation also assumes that a participant actually executes a compensating action if instructed to do so. With two-phase commit, each participant also trusts the coordinator to be in control of the protocol--the protocol is inherently asymmetric because the coordinator knows the outcome before any of the participants. Since trust is a general issue for web services, we assume that some other mechanisms are put in place to deal with trust in a general sense. In terms of transactions, we need to integrate with those general mechanisms to handle trust. It is unlikely that we can treat trust as a completely orthogonal issue to transactions.

References

[1] K. Govindarajan, A. Karp, H. Kuno, D. Beringer, and A. Banerji, "Conversation Definitions: defining interfaces of web services," submitted to the 2001 W3C workshop on web services.

[2] J. Gray and A. Reuter, "Transaction Processing: concepts and techniques," Morgan Kaufmann Publishers, 1993.

[3] Distributed Transaction Processing: The XA Specification, X/Open Snapshot, 1991.