Editor's draft 12/24/2011, see Overview for copyright, caveats, etc.
The primary distinction between this work and previous W3C analyses of the evolution of the web is that it specifically addresses the HTML-style cycle of experimentation and adoption. In particular, it accepts that breaking "the HTML cycle" isn't feasible, and that a new model of extensibility is needed.
The primary approach is to formalize the compatibility relationship between implementation, specification, and language.
This section establishes a framework for discussing evolution in those terms. The terms used here are widely used in computer science, and in W3C documents in particular, but their usage varies. The definitions below establish a common vocabulary for this document and reduce some of that ambiguity. While they may not correspond to every use of each term, they are intended to be precise enough for use in the rest of the document.
Three Strands: Language, Implementation, Specification
When considering evolution, three aspects evolve together, but not in lock step: languages, their implementations, and their specifications. To make an analogy with natural languages: there are languages, dialects, and words, which evolve; there are dictionaries, which describe or prescribe those languages; and there are people, groups, and communities, who use those languages to communicate. Similarly, we can discuss evolution in terms of the following concepts:
- protocol
- a general term for a way in which agents interact, according to some set of rules and meaning; a [communication protocol] (HTTP is a protocol. SMTP is a protocol for sending mail.)
- language
- a term used in this document to cover [formal languages] and [file format]s used in communication. (HTML and CSS are languages. JPEG and PNG are 'file formats', but are considered 'languages' for this document.)
- protocol element
- a component of one or more protocols or languages that is defined or described independently. (A URI, the specification of a gradient, an HTTP request method, and a date string are protocol elements.)
- implementation
- software installed by agents to manage the interaction with others. (Internet Explorer, Apache HTTP Server, and Squid are implementations.)
- specification
- a technical document which describes (some part of) a protocol, language, or protocol element, and gives rules for how implementations of them are expected to behave. (an 'editor's draft', Internet draft, proposed standard are specifications)
- standard
- a specification published by a group as representing some level of agreement among those planning to build, maintain, or use the implementations of the protocols, languages and protocol elements. (A Recommendation and an IETF standards-track document are standards.)
Braided evolution
Although there is a strong relationship between how these strands evolve, they do not evolve in lock-step:
- protocols, languages, protocol elements
- evolve as their common implementations evolve.
- implementations
- evolve as their implementers create or adopt new features; in many cases, that evolution requires evolution of, or additions to, the protocols, languages and protocol elements the implementations use to communicate.
- specifications, standards
- Standards evolve as new specifications (versions, editions) are written, made available, reviewed, discussed, revised, and eventually agreed to; each accepted specification becomes a new edition or version of its predecessors. A new specification might lead implementations (proposing additions or changes) or follow them (where the specification has been changed to match implementation behavior).
Relationships and examples
- Multiple protocols and languages may use the same protocol element. (Many languages and protocols use URIs.)
- One protocol may use another. (HTTP may use TLS.)
- Protocols usually involve transmission from one agent (the sender) to another (the receiver), where the transmission is intended to be interpreted by the receiver as a protocol element or according to a language.
- A language can be thought of as a particularly complex protocol element.
- Using the same protocol element in multiple languages and protocols often facilitates linking multiple applications together.
- In a distributed system (consisting of multiple agents on a network), each agent uses implementations of the protocols and languages to interact with other agents.
- In the Web, for some languages, creation of content in that language involves developers typing out the language (e.g. writing JavaScript code) while in others, content is emitted by an implementation of the language.
- HTTP supports transmission of data in a language where the language to be used is indicated by the “content-type” protocol element.
- HTML is a language, the primary language used in the web. Other languages used in the Web are JPEG, GIF, and CSS. Note that one of the principal cases for this document is the observation that the languages used in the web are less like formal languages.
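The role of the "content-type" protocol element noted above can be illustrated with a minimal sketch. It assumes a receiver that keeps a small lookup table from media type to language; the table `LANGUAGE_BY_MEDIA_TYPE` and the function `language_for` are hypothetical names introduced here for illustration, and header parsing is delegated to Python's standard `email.message.Message`.

```python
from email.message import Message
from typing import Optional

# Hypothetical table: which language (in this document's sense) a receiver
# uses to interpret a body labelled with a given media type.
LANGUAGE_BY_MEDIA_TYPE = {
    "text/html": "HTML",
    "text/css": "CSS",
    "image/jpeg": "JPEG",
    "image/png": "PNG",
    "image/gif": "GIF",
}


def language_for(content_type_header: str) -> Optional[str]:
    """Return the language indicated by a Content-Type header value, if known."""
    msg = Message()
    msg["Content-Type"] = content_type_header
    # get_content_type() lowercases the media type and drops parameters
    # such as charset, so only "type/subtype" is used for the lookup.
    return LANGUAGE_BY_MEDIA_TYPE.get(msg.get_content_type())


print(language_for("text/html; charset=utf-8"))
print(language_for("application/x-unknown"))
```

A media type missing from the table yields `None`; what a receiver should do in that case is exactly the kind of question a specification has to settle.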
Conformance, Roles, and Normative Language
(possibly move this to separate section Normative)
When considering a specification and its relation to implementations and languages:
- role
- In a protocol, the categories of the parties involved. For a language (or file format, etc.), the roles of reader and writer are implicit.
- instance
- For a language (or file format), a file (or a transmission by a protocol) intended as an expression in the language.
- conform
- For a protocol, an implementation of a role conforms to a specification if it behaves in a way that is consistent with the specification's description, and if it meets all of the normative requirements in the specification (for that role). For a language, an instance of the language conforms to a specification if it meets the rules of the specification; implicit in this definition is the idea that the instance
- normative requirement
- an assertion within a specification (usually using the [RFC2119] language of MUST, MAY, SHOULD) which sets conformance expectations for one or more roles
- interoperable
- For a protocol, an implementation is interoperable with other implementations to the extent that, when acting according to the role it plays in the communication, it communicates with other implementations as expected.
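The definitions of role, normative requirement, and conformance above can be sketched as a small data model. This is an illustration under stated assumptions rather than a definitive reading: the `Requirement` class, the `conforms` function, and the toy specification are all invented for this example, and only MUST-level requirements are treated as mandatory.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable


@dataclass(frozen=True)
class Requirement:
    """One normative requirement from a (hypothetical) specification."""
    role: str    # role it applies to, e.g. "sender"
    level: str   # RFC 2119 keyword: "MUST", "SHOULD", or "MAY"
    text: str    # the requirement as the specification states it
    check: Callable[[Dict[str, bool]], bool]  # predicate over observed behavior


def conforms(behavior: Dict[str, bool], role: str,
             requirements: Iterable[Requirement]) -> bool:
    """True if the behavior meets every MUST-level requirement for the role.

    Assumption: SHOULD and MAY are not counted against conformance here.
    """
    return all(
        req.check(behavior)
        for req in requirements
        if req.role == role and req.level == "MUST"
    )


# A toy specification with two roles (all requirements are invented).
TOY_SPEC = [
    Requirement("sender", "MUST", "label the content type",
                lambda b: b.get("labels_content_type", False)),
    Requirement("sender", "SHOULD", "declare a character encoding",
                lambda b: b.get("declares_charset", False)),
    Requirement("receiver", "MUST", "ignore unknown fields",
                lambda b: b.get("ignores_unknown_fields", False)),
]

careful_sender = {"labels_content_type": True, "declares_charset": False}
sloppy_sender = {"labels_content_type": False, "declares_charset": True}

print(conforms(careful_sender, "sender", TOY_SPEC))
print(conforms(sloppy_sender, "sender", TOY_SPEC))
```

Conformance is evaluated per role: the same implementation could conform as a sender and fail as a receiver. Interoperability is a separate question, since it depends on the other implementations actually encountered.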
Standards and Specifications
A protocol involves communication between multiple parties; in most situations, those parties fall into categories over some course of the communication: a sender, a receiver, a client, a server, a proxy, an intermediary, a sniffer, a spider, and so forth. A specification for a communication protocol will describe the roles involved and the normative requirements for them, as well as considerations for achieving interoperability with other implementations.
The primary common purpose of developing a standard is to facilitate and ensure interoperability. A well-written standard has the property that implementations that conform to it are interoperable with other implementations likely to be encountered.
An additional purpose of a standard is to provide a way of naming features and determining whether they are implemented, and a way of determining the non-feature requirements (accessibility, security, privacy, globalization) for implementations that conform to the standard.
Compatibility and Interoperability
(this section will redefine some of the compatibility terms from
http://www.w3.org/2001/tag/doc/versioning-strategies
to separate out implementation and specification compatibility; in particular, the case where a previous specification didn't match widely deployed implementations.)
Risks
(this section may not be needed, but the goal was to have a common definition of what the risks are to uncontrolled or unmanaged evolution)
Benefits of extensibility include experimentation and growth while retaining stability, including new features and the adaptation of existing ones. However, there are also risks of unintended consequences, some of which are outlined in [IAB-extension]. The risks are associated with the deployment of implementations of extensions and new versions.
- interoperability problems
- The more options a protocol or language has, and the more alternatives that are added, the harder it is to guarantee the unique benefit of global interoperability.
- reverse engineering
- motivation of implementors to follow "leading" implementations
- security vulnerabilities
- Standards organizations have repeatedly seen the combination of a seemingly innocuous extension with an unanticipated context result in security vulnerabilities. Security analysis is difficult, and combinatorial analysis is even more difficult.
- instability of conformance
- If references, lists of allowable values, and combinations are allowed to vary over time, then the notion of "conformance" becomes difficult to maintain.
Evolution of Specifications and Standards
(this text was left over and may be redundant)
Managing evolution of standards in a world where there are multiple implementations involves coordinating the evolution of these different aspects. The process by which specifications become standards involves coordination between multiple parties, and significant review. Various means are used to assist the evolution of implementations while maintaining interoperability, without being unduly held back by the standards process.
There are several ways in which a specification can allow for evolution and extensibility of the language (or protocol or protocol element) it specifies, such that if the specification becomes a standard, the specified technology can still be extended or changed without re-writing or re-reviewing the specification. These include:
- modularization
- dividing a specification into multiple components to allow them to evolve independently
- references to evolving specifications in a series
- the normative references of a specification may point not to a particular edition or version of a specification or standard, but explicitly allow the reference to evolve.
- use of identifiers and registries
- as expanded in other sections of this document.
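The use of identifiers and registries listed above can be sketched as follows. This is a minimal illustration, assuming a specification whose rule is "the value must be a registered identifier"; the `Registry` class, the `value_is_allowed` function, and the identifiers `alpha` and `beta` are hypothetical and are not taken from any real registry.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Registry:
    """A hypothetical registry of identifiers that a specification refers to by name."""
    name: str
    entries: Dict[str, str] = field(default_factory=dict)  # identifier -> defining document

    def register(self, identifier: str, defined_by: str) -> None:
        """Add a new identifier; existing entries are never silently replaced."""
        if identifier in self.entries:
            raise ValueError(f"{identifier!r} is already registered in {self.name}")
        self.entries[identifier] = defined_by


def value_is_allowed(registry: Registry, value: str) -> bool:
    """The specification's rule, written once: the value must be registered."""
    return value in registry.entries


tokens = Registry("example-tokens")
tokens.register("alpha", "base specification")

print(value_is_allowed(tokens, "beta"))   # checked before the extension is registered
tokens.register("beta", "later extension document")
print(value_is_allowed(tokens, "beta"))   # checked after registration
```

The rule in `value_is_allowed` never changes, so the language is extended without rewriting the specification. The same sketch also shows the risk listed earlier as instability of conformance: whether an instance using `beta` is allowed depends on when the registry is consulted.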
The basic problem is how to keep the evolution of specification, implementation, and language in sync without, for example, making old conforming implementations non-conforming. [IAB-extension] discusses costs and benefits, but it tries to separate 'routine' and 'major' extension categories, based on the impact that adding a new identifier has on the base protocol or language. Some extensibility points have requirements that are not obvious, well-documented, or well-understood, and could affect proper functioning in some way. If so, a process is needed that gives some qualification of whether a registry item has passed meaningful review, and of whether someone other than the inventor of the registry item can update its specification.
Goals include:
- lower the cost of evolution
- preserve interoperability
- match reality
- allow for private extensions
- give implementers guidance about what is actually needed to be interoperable with other deployed systems
- allow discovery of what is meaningful and important
- ensure the information is timely and doesn't go out of date or disappear
- make sure that it is stable and evolvable at the same time
- manage the transition from experimental to stable to standard