WSTF Internationalization Usage Scenario:
Aggregation Pattern

Author

Addison P. Phillips [webMethods]

Status of This Document

This document is for reference purposes only and has no official standing or status.

Version

0.3 [2003-05-07]

Introduction

Aggregation is a Web Services message exchange pattern (MEP) most closely approximating the WS-Architecture Usage Scenario [1] S030, Third-Party Intermediary [1a]. In this pattern one Web service provides an input to a Web service based application that separately interacts with another set of clients.

Canonical Example (Online Auction)

Figure 1: Third Party Intermediary

The example given in S030 is an online marketplace or auction, in which the "client" offers a contract or product for bid and the separate auction service manages the process of matching high (or low) bidders to the "opportunity". This is a common kind of marketplace transaction. The key here is that there are two Web services with an application between them. The implementation of an actual application is not discussed in detail. In fact, except to note that different SOAP request/response relationships may be established as best suits the needs of the implementation, the text was carefully and considerately written. As a result, this scenario contains no internationalization considerations that need to be addressed in the text. The comments that follow are all directed towards implementers who may be considering this usage pattern.

Management Example

Another, slightly imperfect, example of this scenario is distributed management. In this scenario there is generally a "management server" that receives "notifications" of management events in its domain. A client (such as a console program) may register interest in (that is, subscribe to) specific notifications or classes of notifications and the management server will then package and forward the requested items as it receives them. It may also forward cached state or notifications on an as needed basis.

As with the "auction" example, there are two Web services that interact via an intervening application. There is a subtle difference in the management scenario that makes it a little simpler: the notifications produced are unidirectional and no client information is needed at the managed object or system, whereas with the auction scenario, the details of the "auction" must be passed to the other service's client.

OMI Aggregation Pattern

One existing implementation of Web service based systems management is OMI [2]. In this implementation (shown in the figure), the host receives systems management events from managed objects and stores them. Clients of this host register or subscribe to specific events that they are interested in receiving. After the registration, the host forwards any events it receives to the subscribed clients.

Internationalization Issues

The internationalization issues in this pattern fall into seven categories:

  1. Negotiation of natural language preference.
  2. Negotiation of international preferences.
  3. Multiple language resolution of data.
  4. Passing or matching language preferences between intermediary clients.
  5. Natural language handling in faults.
  6. Ordering, grouping, and collation of notifications.
  7. Service Descriptions.

These issues are common to Web Services and can be addressed in some cases via application design.

In-Depth: Auction Example

In the auction example, the Web services client that starts the interaction will submit data that starts the auction process. If there is human readable (natural language) data in this structure, it is generally targeted to a single market and is probably in a single language.

If the service is well-designed, the language of this data is tagged with xml:lang or the structure of the data indicates the language or market that it applies to through some sort of inferential mechanism. This value may then be used to influence which clients of the bidding portion of the process actually see the data or how they see it.

For example, if users must register to "post" an offer, their user registration may contain the information necessary to correlate the offer with potential bidders (and to eliminate others on the same basis) and the use of xml:lang to tag specific data elements may therefore be redundant.

In other circumstances (in which, perhaps, procurement or contract offerings are offered globally) the language of the offer may be important to the ultimate user of the data (in this case the bidder) or to the intermediary (for example, the intermediary may need to parse the description linguistically in order to keyword-index it, or the intermediary may filter offers in some logical fashion that depends on the language).
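As a minimal sketch of how an intermediary might read these tags, the following Python resolves the effective xml:lang of each offer description (including values inherited from a parent element) and filters offers by language. The element names (offers, offer, description) and the sample data are hypothetical and are not taken from S030 or any real auction schema.

```python
import xml.etree.ElementTree as ET

# xml:lang lives in the reserved XML namespace; ElementTree exposes it under this key.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical offer document; the element names are illustrative only.
OFFER_XML = """
<offers xml:lang="en">
  <offer id="1"><description>Steel beams, 40 tonnes</description></offer>
  <offer id="2"><description xml:lang="de">Stahlträger, 40 Tonnen</description></offer>
</offers>
"""

def descriptions_with_language(root):
    """Yield (offer id, language, text), resolving inherited xml:lang values."""
    def walk(element, inherited, offer_id):
        # An element without its own xml:lang inherits the nearest ancestor's value.
        lang = element.get(XML_LANG, inherited)
        if element.tag == "offer":
            offer_id = element.get("id")
        if element.tag == "description":
            yield offer_id, lang, element.text
        for child in element:
            yield from walk(child, lang, offer_id)
    yield from walk(root, None, None)

def offers_in_language(root, wanted):
    """Return the ids of offers whose description is tagged with the wanted language."""
    return [oid for oid, lang, _ in descriptions_with_language(root) if lang == wanted]

root = ET.fromstring(OFFER_XML)
for offer_id, lang, text in descriptions_with_language(root):
    print(offer_id, lang, text)
print(offers_in_language(root, "de"))
```

The comparison here is an exact string match on the tag; a real intermediary would probably want a looser match (for example, treating "de-CH" as acceptable for "de").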

In-Depth: Systems Management Example

The client (consumer) of systems management notifications is generally a human-interactive process, such as a management console, which attempts to show the status of a managed object. The status value itself is generally an enumerated value ("up", "down", "starting", "stopping", and a few more). These can be localized by the client into a particular language. In fact, these kinds of items are often rendered graphically: a green light for a system that is "up", for example.

More complex interactions generally require more complex information. For example, when a system emits a notification about an error, the range of errors is generally larger than can be enumerated at the client in advance. These values are generally resolved to a string or message. The language of the message delivered to the client should be in the natural language of the client (so that the administrator can read it) and the range of languages delivered should not be broad, even if the managed objects are widely distributed in the network.

One way to achieve this effect is to run all systems in the same language configuration, but this is a throwback to the days when software was not capable of multilingual execution.

Natural Language Negotiation

"Natural Language" refers to human use of language. Generally systems that are internationalized can produce messages in a variety of different natural languages. These systems are referred to as "localized", because their messages (and frequently behavior) are tailored to the individual cultural expectations for a specific target market or group of individuals.

Distributed processing, as with Web services, must allow for several patterns of behavior in the back end implementation represented by the service.

There are four patterns that such a service may follow. These are:

  1. Language Neutral.
  2. Server Language.
  3. Specific Language.
  4. User Specified.

In each of these patterns, the service description (commonly WSDL) and actual protocol or invocation (SOAP is presumed here) should reflect the requirements of the service's own pattern of behavior.

International Preference Negotiation

International preferences are similar to the natural language preference described in the "Natural Language" section above. Some of the other preferences that a service might be interested in include:

Some of these preferences may be inferred from the natural language by converting the natural language preference to the host machine's "locale". Other items (such as timezone) are orthogonal or (like collation) imperfectly or incompletely described by a natural language identifier. Separating these values requires forethought in the design of the service and the setting of reasonable default values.

In the sections that follow, you will see the word "locale" used as an adjunct to natural language. A locale is not a language and the language tags discussed in the succeeding sections should not be confused with actually being locale tags or identifiers. However, there is commonly a close relationship between the language identified by such a tag and the corresponding locale in the underlying platform and a software process may choose to use language tags to select many of these additional operational settings or international preferences.

Service Design in General

WSDL [3]

Web service descriptions should consider how to communicate language or locale-of-operation choices in a consistent manner. In the sections that follow, specific patterns are recommended as good canonical references. However experience shows that a specific implementation may require additional contextual information not conveyed with a simple language tag. Generally this type of additional information should be encoded into the data structure defined for actual interchange in the message body (such as a soap:body block), rather than as additional header information as shown in some of the examples below. This is because specific implementation decisions should be expressed as part of the service's signature: you may require additional or different data in future versions.

In the examples below, adoption of a generic method for exchanging "international contextual information" will allow implementations to better model the natural language and locale processing choices offered by the services.

In all cases, the implementer should consider adding a language tag to any operation fault elements to show what language to expect fault messages to be generated in.

In all cases, descriptive text should be tagged with its actual content language using the xml:lang attribute (where permitted). Consideration should be given to providing documentation within services in alternate languages when the service is expected to be used by people in other countries or by speakers of other languages.

SOAP

In general, SOAP documents should structure data elements in ways that make the most sense for the specific underlying implementation. In the examples given, the user's natural language is passed in an optional header element separate from the specific data structures required to operate the underlying service logic.

This is by design.

Software developers generally get their language resources (translated messages and other locale-specific data) from their programming environment. This functionality is implemented in many ways, but the pattern for writing the logic is always similar: the language and locale preferences are not included in the parameter list of the service itself because the processing environment (JVM, OS, .NET framework, etc.) maintains this information. SOAP Processor implementations should be designed to recognize natural language information passed in the transport (such as HTTP Accept-Language[5]) or in SOAP headers as defined in this document or in the specific implementation-dependent extension of this model and populate or set the appropriate values in the service's environment.

For example, a .NET SOAP Processor might set the service's thread default CultureInfo using the language tag. A J2EE implementation might populate the ServletRequest Locale property with a java.util.Locale constructed from the ISO639 and ISO3166 fields embedded in a language tag. And so forth.
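The same idea can be sketched in Python, under the assumption that the SOAP Processor owns a per-request context that service code reads instead of taking a language parameter. The names RequestLocale, current_locale, handle_request, and echo_locale_service are hypothetical; only the split of a language tag into its ISO 639 and ISO 3166 parts follows the text above.

```python
import contextvars
from typing import NamedTuple, Optional

class RequestLocale(NamedTuple):
    language: str           # ISO 639 language code, lower case
    region: Optional[str]   # ISO 3166 country code, upper case, if present

# Hypothetical per-request setting, populated by the processor, read by the service.
current_locale = contextvars.ContextVar(
    "current_locale", default=RequestLocale("en", None)
)

def locale_from_language_tag(tag):
    """Split a language tag such as 'fr-CA' into its language and region fields."""
    subtags = tag.strip().split("-")
    language = subtags[0].lower()
    region = None
    # A two-letter second subtag is read as an ISO 3166 country code.
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        region = subtags[1].upper()
    return RequestLocale(language, region)

def handle_request(language_tag, service):
    """Set the environment for one request, run the service, then restore it."""
    token = current_locale.set(locale_from_language_tag(language_tag))
    try:
        return service()
    finally:
        current_locale.reset(token)

def echo_locale_service():
    """Stand-in service: note that it takes no language parameter at all."""
    loc = current_locale.get()
    return f"{loc.language}_{loc.region}" if loc.region else loc.language

print(handle_request("de-CH", echo_locale_service))
```

The service signature stays free of locale parameters, which is the point of the pattern: the processor, not the caller of the business logic, is responsible for the environment.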

Faults

Fault message "text" elements must be labelled with an appropriate language identifier, as defined in XML 1.0: that is, an xml:lang attribute containing an RFC3066 (or its successor) language identifier. If the transport provides the user's language preference (such as HTTP Accept-Language), then that language or set of languages should be preferred, followed by the SOAP Processor machine's local language preference.
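A rough sketch of that ordering, assuming the transport is HTTP: the Accept-Language header is parsed into language ranges ordered by their q-values, and the processor's own language is appended as the final choice. The function names are hypothetical, and the parser is deliberately simplified (for instance, a "*" wildcard is passed through as an ordinary entry).

```python
def parse_accept_language(header):
    """Return language ranges ordered by descending q-value (header order breaks ties)."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag:
            continue
        quality = 1.0  # a range without an explicit q-value has quality 1
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # treat a malformed q-value as "not acceptable"
        if quality > 0:
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]

def fault_language_order(accept_language, processor_language):
    """User preferences from the transport first, then the processor's language."""
    order = parse_accept_language(accept_language or "")
    if processor_language not in order:
        order.append(processor_language)
    return order

print(fault_language_order("da, en-gb;q=0.8, en;q=0.7", "en-US"))
```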

Ideally there should always be a "message of last resort" included in the fault. In many cases this message may be in English, but consideration should be given to the likely users of the system, including the administrators trying to puzzle out the error. Numeric (or ASCII-only alpha-numeric) error codes should be considered for inclusion in all fault messages. This may provide valuable reference when the text of the message itself is in a language not understood by the recipient.
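One way to picture the "message of last resort" together with a numeric code is the sketch below. The catalog contents, the code numbers, and the choice of English as the last-resort language are assumptions made for illustration only.

```python
# Hypothetical fault catalog: numeric code -> {language tag: message text}.
FAULT_CATALOG = {
    1001: {"en": "Subscription not found", "fr": "Abonnement introuvable"},
    1002: {"en": "Invalid notification filter"},
}

# Assumed last-resort language; a deployment would pick this for its likely users.
LAST_RESORT_LANGUAGE = "en"

def fault_text(code, preferred_languages):
    """Return (language, text) for a fault, always prefixed with its numeric code."""
    messages = FAULT_CATALOG.get(code, {})
    for lang in preferred_languages:
        if lang in messages:
            return lang, f"[{code}] {messages[lang]}"
    # Nothing matched: fall back to the last-resort message, or a generic one.
    fallback = messages.get(LAST_RESORT_LANGUAGE, "Unknown fault")
    return LAST_RESORT_LANGUAGE, f"[{code}] {fallback}"

print(fault_text(1001, ["fr", "en"]))
print(fault_text(1002, ["ja"]))
```

Because the code is ASCII-only and appears in every message, an administrator who cannot read the text can still look the fault up.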

When designing specifications intended for interoperability between vendors or implementations, consideration should be given to enumerating the possible faults in advance so that reference numbers can be universally and consistently referenced by disparate implementations.

Language and Context Negotiation Patterns

As noted above, there are four general patterns or policies that may be applied to any specific Web service. These four are:

Language Neutral

A language neutral service generally is one that executes the same regardless of the current runtime locale or user preferences regarding language. All or most strings embedded in the service are not human readable. This is, by far, the most common pattern. The expected behavior of this service is:

WSDL

The Web services description does not require any extra information in order to perform its operations or communicate its capabilities. Therefore no additional fields beyond those required by the actual service implementation should be defined in any of the bindings or operations.

The implementer should consider adding a language tag to any operation fault elements to show what language to expect fault messages to be generated in.

SOAP

No additional attributes or information is required for this pattern.

Faults

Fault messages should follow the guidelines outlined above.

Server Language

This pattern exactly matches Language Neutral, except that the server's local language preference is expected to affect processing.

WSDL

The Web services description may provide an optional header field in the operation output element defining the natural language used by the service at execution time. Since this value is at least partially machine dependent, the value probably should not be set in the actual service description. Instead the soap:header or equivalent field should merely be defined as being available.

SOAP

This Web service pattern is generally similar to that of Language Neutral. Processes that are language or locale affected may retrieve their settings from the default machine settings and the SOAP Processor generally need not provide any special functionality to activate or achieve this.

Specific Language

A specific language preference is assigned to a specific service or port or binding of the service. This language and/or associated locale is expected to affect the way that the service performs its processing or the language of the string data contained in its outbound messages.

WSDL

The outbound message is defined with a specific attribute assigned a value defined by xml:lang. Inbound messages define no specific message information, since the language preference is fixed in advance.

SOAP

Outbound header contains a 'locale' or 'language' element with type xsd:language.

Faults

Faults are generated as with Language Neutral, except that a message in the service-specific language should also be generated. It is placed after any user-specified value but before that of the SOAP Processor, assuming that these three are all different values.
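The ordering described here can be sketched as a small helper that builds the list of fault languages and drops duplicates when two of the three sources agree. The function and parameter names are hypothetical.

```python
def fault_languages(user_language, service_language, processor_language):
    """Order fault languages: user-specified, then service-specific, then processor."""
    ordered = []
    for tag in (user_language, service_language, processor_language):
        # Skip missing values and case-insensitive duplicates.
        if tag and tag.lower() not in (seen.lower() for seen in ordered):
            ordered.append(tag)
    return ordered

print(fault_languages("ja", "de", "en"))
print(fault_languages(None, "de", "DE"))
```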

User Specified

The service will attempt to match its processing to the language specified by the user. The service should therefore provide a way for the user's preference to be communicated, regardless of transport, and the actual value used to perform the processing should be returned in any response that may be generated.

WSDL

Both inbound and outbound headers define an attribute bound to type xsd:language and labelled with an xml:lang element.

SOAP

Inbound and outbound headers optionally take the element defined in the WSDL. If not present, the SOAP Processor's local language preference is used. Outbound headers should reflect as accurately as possible the actual value used in performing the processing. This may vary from that specified by the user according to the rules in RFC3066 (or its successor).
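A minimal sketch of this behavior follows. It assumes a prefix-style match in the spirit of RFC3066 (a requested range matches a tag that is equal to it or that begins with it followed by "-"); retrying with a shortened range when nothing matches is an extra fallback assumed here, not something this document specifies. The supported-language list and the shape of the returned "header" are hypothetical.

```python
def match_language(requested, supported, default):
    """Return the supported tag that best matches the request, or the default."""
    wanted = (requested or "").lower()
    while wanted:
        exact = [tag for tag in supported if tag.lower() == wanted]
        prefixed = [tag for tag in supported if tag.lower().startswith(wanted + "-")]
        if exact or prefixed:
            return (exact or prefixed)[0]
        # Assumed fallback: drop the last subtag ("fr-ca" -> "fr") and try again.
        wanted = wanted.rpartition("-")[0]
    return default

def process(inbound_language, supported=("en", "fr", "de-CH"), default="en"):
    """Pick the language actually used and echo it in the outbound header."""
    used = match_language(inbound_language, supported, default)
    return {"language": used}

print(process("fr-CA"))  # a regional request served by the general "fr" resources
print(process(None))     # header absent: the processor's own preference applies
```

Returning the value actually used, rather than the value requested, lets the client detect that it received "fr" when it asked for "fr-CA".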

Faults

As with Specific Language, but with the user's preferred language or closest match in the position of the specific language in that example.

Multiple Language Resolution of Data

In some cases, the language of the data returned may need to be in more than one language or different targets may require different languages at the same time.

In the auction example, different bidders might wish to view different language descriptions of the "offer".

In the systems management example, if a notification has more than one subscriber, the subscribers may have separate language preferences. In addition, the clients may not have the same language preference as the aggregating server process. How can the aggregating server receive notifications and log them, especially when the client subscription may not happen until later (and thus the language preference is unknown until after the notification has been received)? Designs should take this into account. Generally such localization should be delayed as long as possible. For example, the notifications might be made numeric, with data structures permitted for the transmission of specific data items. Then the aggregator could multiply resolve the notification to a human readable string representation for each consumer (its own log, different clients, etc.).
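The "late localization" idea can be sketched as follows: the aggregator stores a numeric code plus structured parameters, and resolves the notification to text separately for each consumer. The notification code, the parameter names, and the message templates are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Notification:
    code: int                                   # language-neutral identifier
    params: dict = field(default_factory=dict)  # structured data items

# Hypothetical message catalogs held by the aggregator, keyed by language tag.
TEMPLATES = {
    "en": {2001: "Disk {disk} on host {host} is {percent}% full"},
    "de": {2001: "Festplatte {disk} auf Host {host} ist zu {percent}% belegt"},
}

def render(notification, language, fallback="en"):
    """Resolve a stored notification to text for one consumer's language."""
    catalog = TEMPLATES.get(language, TEMPLATES[fallback])
    template = catalog.get(notification.code) or TEMPLATES[fallback].get(notification.code)
    if template is None:
        # No template anywhere: expose the raw code and data rather than nothing.
        return f"[{notification.code}] {notification.params}"
    return template.format(**notification.params)

# The notification is stored once, before any subscriber's language is known.
stored = Notification(2001, {"disk": "sda1", "host": "db01", "percent": 93})
for consumer, language in [("server log", "en"), ("console A", "de"), ("console B", "ja")]:
    print(consumer, "->", render(stored, language))
```

Because the stored form carries no language, a client that subscribes later simply gets its own rendering at delivery time.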

Passing or Matching International Preferences

This particular item is specific to this pattern. In the Online Auction scenario, the "offered item" from client A will probably have a human readable description or other locale- or internationally affected information associated with it.

Implementers may have to choose:

Natural Language Handling in Faults

SOAPFault is the mechanism used to deliver errors in the actual Web service processing. SOAP Faults in version 1.2 are allowed to contain only minimal fault information. Most of the value in a SOAP Fault message is in the actual human readable message returned in the fault structure.

In SOAP 1.2 this mechanism was extended to allow multiple languages to be returned and the use of xml:lang attributes was mandated for every returned entry. This mechanism is suitable for returning faults in an environment in which the number of languages is relatively small and the range of languages to be returned is known in advance.

Many actual SOAP implementations are localized into many languages simultaneously. To prevent faults from becoming overly large and difficult to manage, implementations should include some strategy that reduces the set of languages to a minimum and attempts to match the language of the fault as closely as possible to the client that ends up viewing the message.
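As one possible reduction strategy, the sketch below builds a SOAP 1.2 style Fault whose Reason contains only the languages the client is known to want, plus one last-resort entry, with each Text element carrying xml:lang. The function name, the choice of "en" as last resort, and the sample messages are assumptions; the fault code is fixed to env:Receiver only to keep the example short.

```python
import xml.etree.ElementTree as ET

ENV = "http://www.w3.org/2003/05/soap-envelope"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Serialize the envelope namespace with the prefix used in the Code value below.
ET.register_namespace("env", ENV)

def build_fault(reasons, wanted_languages, last_resort="en"):
    """Build a Fault whose Reason holds only the languages worth sending."""
    chosen = [lang for lang in wanted_languages if lang in reasons]
    if last_resort in reasons and last_resort not in chosen:
        chosen.append(last_resort)
    fault = ET.Element(f"{{{ENV}}}Fault")
    code = ET.SubElement(fault, f"{{{ENV}}}Code")
    ET.SubElement(code, f"{{{ENV}}}Value").text = "env:Receiver"
    reason = ET.SubElement(fault, f"{{{ENV}}}Reason")
    for lang in chosen:
        text = ET.SubElement(reason, f"{{{ENV}}}Text")
        text.set(XML_LANG, lang)  # every returned entry is language-tagged
        text.text = reasons[lang]
    return fault

# Hypothetical localized messages available on the server for one fault.
available = {
    "en": "Processing error",
    "de": "Verarbeitungsfehler",
    "fr": "Erreur de traitement",
}
fault = build_fault(available, ["de", "ja"])
print(ET.tostring(fault, encoding="unicode"))
```

Here the server holds three translations but sends two: the German text the client asked for and the English message of last resort.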

Ideal implementations will include mechanisms for "late localization" of the values.

Future versions of SOAP should probably consider allowing additional structured information in a Fault so that suitably internationalized clients can perform the localization and formatting themselves.

Ordering, Grouping, and Collation of Notifications

Some types of internationally sensitive processing cannot be inferred solely from a language identifier. Collation (sorting) is such a process. The collation may even affect the results that one gets from a Web service. For example, if the service selects "all offers > c", in certain locales you might receive back entries starting with "ch" (which is treated as a separate letter). Or if the service returns "all items < z", in certain locales this may not include accented letters such as å (A-RING).
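The A-RING case can be illustrated with two toy sort keys: one that places å, ä, and ö after z (as in the Swedish alphabet) and one that sorts accented letters with their base letter. Both keys are simplified stand-ins written for this example; a real service would rely on its platform's collation support rather than hand-built tables.

```python
import unicodedata

# Swedish alphabet order: å, ä, ö are separate letters that follow z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"
SWEDISH_RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}

def swedish_key(word):
    """Toy key: rank each letter by its position in the Swedish alphabet."""
    composed = unicodedata.normalize("NFC", word.lower())
    return [SWEDISH_RANK.get(ch, len(SWEDISH_RANK)) for ch in composed]

def accent_insensitive_key(word):
    """Toy key: strip combining marks so that å sorts with a."""
    decomposed = unicodedata.normalize("NFD", word.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def items_before(items, bound, key):
    """Return 'all items < bound' under the given collation key, in sorted order."""
    return sorted((item for item in items if key(item) < key(bound)), key=key)

items = ["ånger", "apple", "zebra", "yacht"]
print(items_before(items, "z", accent_insensitive_key))
print(items_before(items, "z", swedish_key))
```

The same query over the same data returns "ånger" under the first key and omits it under the second, which is why the collation in effect needs to be part of the service's contract rather than an implementation accident.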

<more here>

Service Description

Service descriptions are human-readable text intended to describe what the service does and how it should be used. To be useful, the description needs to be a natural language sentence or even a set of keywords in the language that the likely user audience will understand. There should be a way to tag the content with the specific language that it is in and to allow multiple languages. Otherwise false positives or negatives will result.

References

[1] [WS-Arch] Web Services Architecture Usage Scenarios

[2] [OMI] Open Management Interface

[3] [WSDL] Web Services Description Language, v1.2 (Working Draft)

[4] [OASIS WSDM] OASIS Web Services Distributed Management.

[5] [IETF RFC 2616] HTTP

Addison Phillips (aphillips@webmethods.com)
$Id: ws-aggregation-pattern-scenario.html,v 1.10 2003/05/08 00:20:00 aphillip Exp $