Web Services in a Web Company

Hugo Haas and Mark Nottingham, Yahoo! Inc.

Services on the Web, or “Web services”, is a vague term, but in practice it usually means either SOAP-based or HTTP-based machine-to-machine communication.

Yahoo! uses both types internally and externally, facing different challenges with each.

SOAP-Based Services

SOAP-based Web services make it easy to provide and invoke services, thanks to extensive tooling focused on developer productivity. Parts of Yahoo! have successfully deployed SOAP for communication within a single group, between groups, and with external vendors and clients.

However, we have run into interoperability issues when building services with complex messages. This has prompted certain groups to provide client libraries for a variety of languages, in addition to the WSDL description that openly available tools can consume.

Additionally, neither the extensibility of the SOAP envelope ("WS-*") nor the abstraction of the transport binding is often exercised. SOAP Web services almost always run on top of HTTP, and WS-Security, one of the very few standard SOAP extensions in use, is not available in all toolkits.

As a result, our primary interest in SOAP lies in the benefits that specific tools bring, rather than the benefits of the overall stack, whose interoperability and complexity problems are well-documented elsewhere.

HTTP-Based Services

HTTP-based services are more prevalent at Yahoo!, especially externally, as illustrated on the Yahoo! Developer Network.

They are attractive because of their simplicity, their familiarity to Web developers, and broad tool support, particularly in dynamic languages like Perl, Python and PHP.

That said, we’ve encountered many issues in designing, deploying and maintaining HTTP services. We believe that they are solvable, with appropriate expenditure of effort and coordination in the community.

As an example, the most pressing problem we see today for HTTP-based services is authentication. The mechanisms defined in RFC 2617 (HTTP Basic and Digest authentication) fail to meet even the basic requirements of the modern Web.

One problem (of many) is that they are limited to one set of credentials. This disallows multi-party authentication; for example, where both the end user and a partner need to be identified. Other problems with and requirements for Web authentication have been documented elsewhere, including in Yahoo!'s submission to the W3C Workshop on Usability and Transparency of Web Authentication.
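To make the single-credential limitation concrete, here is a minimal sketch of how an RFC 2617 Basic credential is built and read back. The function names and the sample user are our own illustrative assumptions, not part of any Yahoo! service; the point is only that the header value encodes exactly one user-id and password pair, with no slot for a second party such as a partner application.

```python
import base64


def basic_authorization(user: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def parse_basic_authorization(value: str) -> tuple[str, str]:
    """Recover the single (user, password) pair carried by a Basic credential."""
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic credential")
    decoded = base64.b64decode(token).decode("utf-8")
    # The user-id may not contain ":", so the first colon is the separator.
    user, _, password = decoded.partition(":")
    return user, password


# Hypothetical end user; there is nowhere to also identify a partner.
header_value = basic_authorization("alice", "secret")
print(header_value)
print(parse_basic_authorization(header_value))
```

Identifying both the end user and a partner would require a second credential, which is why services tend to fall back on the workarounds described next.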

The most common workaround for these shortcomings is to use cookies, which is attractive because it puts both the choice of algorithms and the burden of correct implementation onto the server. However, this rules out common tools that do not handle cookies, such as XSLT processors (e.g., in the document() function), Atom aggregators and CalDAV clients.
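The following sketch illustrates why the cookie approach leaves the algorithm choice to the server: the server signs a session value with whatever scheme it likes, and the client only echoes the opaque cookie back. The secret, the cookie name "session" and the HMAC-SHA256 scheme are assumptions made for illustration, not a description of any deployed system.

```python
import hashlib
import hmac
from http.cookies import SimpleCookie

# Hypothetical server-side secret; never sent to the client.
SERVER_SECRET = b"hypothetical-server-secret"


def _sign(user: str) -> str:
    return hmac.new(SERVER_SECRET, user.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_cookie(user: str) -> str:
    """Return a 'name=value' cookie string binding the user to a signature."""
    cookie = SimpleCookie()
    cookie["session"] = f"{user}.{_sign(user)}"
    return cookie["session"].OutputString()


def verify_cookie(cookie_header: str):
    """Return the user name if the cookie is authentic, otherwise None."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    if "session" not in cookie:
        return None
    user, _, signature = cookie["session"].value.rpartition(".")
    expected = _sign(user)
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return user
    return None


print(verify_cookie(issue_cookie("alice")))
```

A client that cannot store and resend the cookie never reaches the verification step at all, which is the interoperability cost noted above.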

As a result, one of the more interoperable ways to pass credentials for HTTP services is through the URI, which is clearly in conflict with the Web architecture. At the least, the use of cookies for authentication needs to be acknowledged in the Web architecture, until a better solution can be found and widely deployed.
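As a rough sketch of the URI approach, the snippet below appends a credential to a request URI as a query parameter. The host, the parameter name "appid" and its value are hypothetical. Any client that can dereference a URI can use the result, but the credential becomes part of the resource identifier, so it is exposed wherever the URI is logged, cached or shared.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_credential(url: str, name: str, value: str) -> str:
    """Return the URL with the credential appended as a query parameter."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical endpoint and application identifier.
signed = add_credential(
    "https://api.example.com/search?query=web+services", "appid", "demo-app"
)
print(signed)
```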

The Importance of Tooling, and the W3C’s Role

Well-architected, clearly-documented specifications for the Web are useless without implementations that follow them, in letter and spirit. This is especially true for services, where so many decisions are driven by the tools at hand, and those tools are often built for the Web as practised in the 1990s.

Thus, tooling is the primary concern of architects when choosing a strategy to implement Web services.

W3C's work on XML Schema Patterns is a good step for users to work around issues with databinding tools and SOAP toolkits. However, we can only regret the lack of vendor participation in this effort.

The W3C has already helped HTTP-based services in various ways, such as by producing the Web Architecture, various TAG findings, and the Common User Agent Problems. There is also promising new work, such as the Web APIs WG, that should help improve interoperability for services on the Web.

However, there is only so much that the W3C can do without direct involvement of tool authors. Because of the structure and history of the organization, it may not be possible for the W3C to be the sole venue for such work; rather, it requires careful coordination with and outreach to organizations like the IETF, the Apache Project and the Mozilla Foundation, as well as to vendors. In doing so, we see the W3C acting to coordinate activity, safeguard the architecture and ensure that the Web stack is well-integrated, roles that it has played well in the past.