Content Selection Primer 1.0

1 Introduction

Editorial note
Processes, 'in the large', the overall picture into which DISelect fits.

1.1 Structure of the specifications

Editorial note
Describe the related specifications and how they fit together

1.2 Separation of concerns

It is widely recognized within the computer science community that a good approach to solving a problem is to break it down into a set of sub-problems that overlap as little as possible. In the context of web page creation, one of the best examples of this is the decoupling of content from look and feel. This is typically achieved by using Cascading Style Sheets [CSS2] with a markup language, such as XHTML Version 2 [XHTML 2].

This principle is commonly known as "Separation of Concerns". Generally, solutions that follow this approach are less costly to develop and maintain and more flexible in use. In addition, the highly decoupled solution components that tend to result from the approach are more likely to be reusable. For example, changing the CSS style sheet used with a particular web page may make it possible to use the page in different circumstances without the need to change the markup.

Introduction of new features, such as DISelect, into a markup language can provide facilities that lead to the principle of separation of concerns being compromised. This is by no means restricted to DISelect, of course. For a long time, authors have been able to choose to use the <style> element, available throughout most of the family of HTML and XHTML markup languages, to embed their styling directly within their markup. This clearly breaches the principle. The fact that something is possible does not mean that it is desirable. There may be perfectly legitimate reasons for use of the <style> element within markup. However, most observers would probably conclude that it should be the exception rather than the norm.

The situation is very similar for DISelect. It is possible to create solutions that violate the principle of separation of concerns. However, it is not necessary to do that nor, in most circumstances, is it desirable. In this primer we will illustrate a variety of ways in which DISelect can be used. We will discuss issues of separation of concerns where they arise.

1.3 Adaptation

Adaptation is an important concept for systems that must support a wide variety of devices with different capabilities and characteristics. The set of attributes and characteristics associated with the context in which materials are delivered to a device is known formally as the delivery context. Two main mechanisms are available by which adaptation can be performed. Either the most appropriate version of some material can be selected from a range of those available, or a version can be created by some kind of transformation. We'll look at both of these mechanisms in the sections that follow. A major role for DISelect is to provide a standard way of defining a set of alternative versions of materials that can be selected for use during adaptation.

1.3.1 Selection

Adaptation by selection involves choosing between different versions of materials according to some set of criteria. Often, selection involves picking one particular variant of a specific resource. For example, several different variants of a particular image might have been prepared to support different delivery contexts. During adaptation, one particular variant might be selected as the most appropriate to use on a particular mobile device. The criteria used would probably consist of specific device characteristics retrieved from the delivery context. Although, in this example, each variant is the same type of media, namely an image, this is by no means a limitation of adaptation by selection. In some delivery contexts, it may be preferable to select a variant of a completely different type.
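Adaptation by selection can be sketched in a few lines of Python. The sketch below is illustrative only: the Variant type, the file names and the rule "widest variant that still fits the display" are assumptions made for the example, not part of any specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    uri: str       # hypothetical location of a prepared variant
    width_px: int  # intrinsic width of the image in pixels


def select_variant(variants, display_width_px):
    """Return the widest variant that fits the display, else the narrowest one."""
    fitting = [v for v in variants if v.width_px <= display_width_px]
    if fitting:
        return max(fitting, key=lambda v: v.width_px)
    # Nothing fits: fall back to the smallest variant available.
    return min(variants, key=lambda v: v.width_px)


# Hypothetical set of prepared variants of one image resource.
VARIANTS = [
    Variant("logo-120.png", 120),
    Variant("logo-320.png", 320),
    Variant("logo-800.png", 800),
]
```

Here the display width stands in for the device characteristics that would be retrieved from the delivery context.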

1.3.2 Transformation

Adaptation by transformation involves creating a new version of some material according to some set of criteria. Returning to the example of images, there may be situations in which it is not appropriate or not feasible to create all of the required variants. An alternative approach is to create an appropriate variant from a single reference image. In this case, the transformation applied is usually known as image transcoding. In addition to changes in size and possibly color depth, transformations may also be required to encode and compress the resulting image differently, to suit the requirements of the delivery context.

It is, of course, quite possible for selection and transformation to be used together during adaptation. Specific variants may be created for particularly important use cases, and transformation used to handle less common cases.
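The combination of selection and transformation might look like the following Python sketch. It assumes, purely for illustration, that prepared variants are keyed by exact display width and that transformation amounts to computing a scaled size from a single reference image; a real transcoder would also re-encode the pixels.

```python
def scaled_size(src_w, src_h, max_w):
    """Scale a reference image to fit max_w, preserving aspect ratio; never upscale."""
    if src_w <= max_w:
        return src_w, src_h
    scale = max_w / src_w
    return max_w, max(1, round(src_h * scale))


def adapt(prepared, reference_size, display_width):
    """Use a prepared variant if one exists for this width, otherwise transform.

    `prepared` maps a display width to a variant name (hypothetical data).
    """
    if display_width in prepared:
        return ("selected", prepared[display_width])
    return ("transformed", scaled_size(*reference_size, display_width))
```

An important use case, such as a popular handset with a 176 pixel display, gets a hand-prepared variant, while every other width is served by scaling the reference image.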

2 Alternative capabilities for content selection

2.1 CSS

Various specifications for Cascading Style Sheets [CSS2], [CSS3-MQ] include facilities that can be used to control the rendering of content. We'll look at those facilities in this section.

2.1.1 Display property

CSS provides the display property which can be used to control whether markup is rendered or not. The property can take various values that control how the browser renders the markup to which it applies. In particular the value none causes the markup and its contents to be suppressed entirely. When combined with ways of controlling the expression of particular CSS statements, this capability offers a rudimentary means for content selection, as we will see shortly.

2.1.2 CSS Media types

Media types are a mechanism within CSS that allows different sets of style properties to be used under different circumstances. To use this mechanism, various parts of a stylesheet are enclosed and are made conditional. The CSS @media rule provides the expression that determines whether or not the CSS definitions that are enclosed are used. The expression is true, and the contained CSS is used, if the particular media type has been specified. Media types can be set from markup, using the media attribute on the <link> element used to reference the style sheet. For example, the link

                        
<link rel="stylesheet" type="text/css" media="print" href="foo.css"/>

causes the stylesheet foo.css to be used and enables definitions that are within the @media rule for print, for example

                        
@media print {
  div#abstract {display: none}
}

The result of this is that the <div> element with an id of abstract is prevented from being rendered. Any content it contains is also suppressed.

About ten different media types have been defined.

2.1.3 CSS Media queries

Media queries are an additional mechanism within CSS that extends the capabilities of @media with more powerful expressions. Rather than just a media type, the expressions can also contain conditions based on specific aspects of the delivery context. For example, the statement

                        
@media screen and (min-width: 400px) and (max-width: 700px) {...}

specifies that the contained CSS definitions apply only if the media type is screen and the width of the display area is between 400 and 700 pixels inclusive.
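The logic of this particular query can be written out as a small Python predicate. This is a hand-coded equivalent of the one example above, not a media query parser, and the function name is an assumption.

```python
def matches_example_query(media_type, width_px):
    """Equivalent of: screen and (min-width: 400px) and (max-width: 700px).

    The min- and max- prefixes express inclusive bounds.
    """
    return media_type == "screen" and 400 <= width_px <= 700
```

All three conditions must hold, because the and keyword combines them conjunctively.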

About a dozen device characteristics have been defined which can be used in this way. They are termed media features.

2.1.4 Limitations of CSS content selection

Although CSS clearly supports the notion of content selection, its facilities are too limited for general use in supporting the myriad of devices that are able to connect to the Web. In particular, the criteria for selection are restricted: only a few media types and media features are available. This is in stark contrast with the literally hundreds of different characteristics that normally form the delivery context. In addition, the media types and features are elements of the CSS syntax. This means that new versions of the CSS specification are needed in order to introduce new types and features.

While the CSS mechanisms are limited to specific features and media types, the use of general purpose XPath expressions in DISelect opens up the possibility of using information from other than just the delivery context. Access to the host document is supported specifically. In addition, it is possible to provide extension functions that access aspects of the current request, device state and other information about the environment.

There are also some practical difficulties in using the CSS-based mechanisms in mobile environments, where the majority of the device diversity is to be found. CSS traditionally executes within a browser. This means that all content, whether relevant for the device or not, must be sent to it before being suppressed. This has significant implications where there are cost, bandwidth and latency issues. These often occur on mobile phone networks.

2.2 HTTP content negotiation

The Hypertext Transfer Protocol [HTTP] includes various mechanisms that allow variations in the content that is returned in response to a request.

2.2.1 Representations and resources

HTTP separates the concept of a resource from that of a representation. It allows the content of a response to be a representation of the resource chosen according to algorithms implemented on the server. This is known as server-driven negotiation. It also allows for a second scheme, known as agent-driven negotiation, where the user agent is presented with a list of available representations of the resource. The agent, possibly in conjunction with the user, is responsible for selecting the most appropriate representation from those available.

2.2.2 Server-driven negotiation

In server-driven negotiation, the server can use the following header fields from the request in determining the representation to be returned:

Accept: This header contains information about the media types that are acceptable in the response to the request.
Accept-Charset: This header contains information about the character sets that are acceptable in the response to the request.
Accept-Encoding: This header contains information about the content encodings that are acceptable in the response to the request.
Accept-Language: This header contains information about the natural languages that are acceptable in the response to the request.
User-Agent: This header contains information about the user agent making the request.

Clearly, this is a very limited set of information on which to base decisions about choice of representation, and it leads to significant drawbacks with the basic approach. For example, there is no way to distinguish images intended for small mobile devices from those intended for desktop computers if they both use the same MIME type. The HTTP specification itself draws attention to such limitations. It also points out the undesirable consequences of transferring a much more comprehensive set of characteristics from the user agent to the server on each request. For comparison, typical commercial adaptation solutions may keep hundreds of characteristics for each device that they support.
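The limitation can be made concrete with a simplified Python sketch of server-driven negotiation on the Accept header alone. The parser below handles only media ranges and q values, and the representation names are hypothetical; it is not a complete implementation of the HTTP rules.

```python
def parse_accept(header):
    """Parse an Accept header into {media_range: q}, ignoring other parameters."""
    prefs = {}
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if not parts[0]:
            continue
        q = 1.0  # q defaults to 1 when absent
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        prefs[parts[0]] = q
    return prefs


def quality(media_type, prefs):
    """Quality for a concrete type, preferring the most specific matching range."""
    major = media_type.split("/")[0]
    for candidate in (media_type, major + "/*", "*/*"):
        if candidate in prefs:
            return prefs[candidate]
    return 0.0


def best_representations(representations, accept_header):
    """Return every representation that shares the highest non-zero quality."""
    prefs = parse_accept(accept_header)
    scored = [(quality(mt, prefs), name) for name, mt in representations]
    top = max((q for q, _ in scored), default=0.0)
    return [name for q, name in scored if q == top and q > 0]
```

When two representations share a media type, the function returns both: nothing in the Accept header lets the server prefer the small image over the large one.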

2.2.3 Adaptation as an extension of server-driven negotiation

The HTTP specification suggests that the User-Agent header may be useful in deciding on the representation to be used. Practical implementations of servers that employ adaptation (see 1.3 Adaptation) to support a myriad of devices actually make extensive use of the User-Agent header to identify the user agent. In conjunction with a repository of device and user agent characteristics, this allows a server to associate much of the delivery context with each request, without requiring it to be presented explicitly by the user agent itself. Of course, if device or network characteristics are dynamic, or if user preferences are involved, a repository is inappropriate since it can only represent fixed quantities. Nevertheless, the combination of fixed properties implied from a repository and preference information from HTTP headers provides a viable basis for the sophisticated adaptation schemes in use on the Web today.
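A minimal Python sketch of this arrangement follows. The repository contents, the substring matching on User-Agent and the characteristic names are all assumptions made for illustration; real repositories are far larger and use more robust matching.

```python
# Hypothetical repository: User-Agent token -> fixed device characteristics.
DEVICE_REPOSITORY = {
    "ExamplePhone/1.0": {"display_width_px": 176, "color_depth": 12},
    "ExamplePDA/2.3": {"display_width_px": 480, "color_depth": 16},
}

# Assumed characteristics for user agents that are not in the repository.
DEFAULTS = {"display_width_px": 1024, "color_depth": 24}


def delivery_context(headers):
    """Combine fixed repository data with preferences carried in the request."""
    user_agent = headers.get("User-Agent", "")
    context = dict(DEFAULTS)
    for token, characteristics in DEVICE_REPOSITORY.items():
        if token in user_agent:
            context.update(characteristics)
            break
    # Language preferences come from the request itself, not the repository.
    accept_language = headers.get("Accept-Language", "")
    context["languages"] = [
        part.split(";")[0].strip()
        for part in accept_language.split(",")
        if part.strip()
    ]
    return context
```

The fixed properties come from the lookup, while the language list is read from the request, mirroring the split between repository data and header data described above.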

While HTTP provides the basic information which can be used within mechanisms that support server-driven negotiation, it says nothing about how the different representations might be specified within the server, nor about the algorithms that might be used to perform the selection. Such matters are left to individual Web servers and applications to determine. One use of DISelect is to provide a standardised mechanism by which representations are associated with resources at a server and by which the rules under which representations are selected are defined. We'll return to this point in 4.3 External selection.

2.3 XHTML 2 embedding attributes

XHTML 2 [XHTML 2] defines a basic method for content selection. This is based on the src attribute within the embedding attributes module. This attribute can be applied to virtually any XHTML 2 element. It references a URI which represents the resource to be used as the content for the element on which it appears. If a variant of the resource referenced by this attribute is returned when requested, it is treated as the content of the element, and any child markup of the element is discarded. However, if the request fails to return a variant, the content of the element is used. The content of the element within the XHTML 2 markup effectively provides a fallback in the event that the referenced resource does not provide an appropriate variant.
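The fallback behaviour can be summarised in a short Python sketch. The fetch callable is hypothetical: it stands for whatever retrieves the resource named by src and is assumed to return None when no variant is available.

```python
def element_content(src, child_content, fetch):
    """Return the referenced resource if it can be obtained, else the child content."""
    body = fetch(src)
    return body if body is not None else child_content
```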

While this mechanism does provide some in-line content selection, it is limited really to selection between a representation returned via an HTTP request and the in-line markup. The limitations of content selection within HTTP have already been described (see 2.2 HTTP content negotiation).

2.4 SMIL

Synchronized Multimedia Integration Language [SMIL] defines mechanisms for content selection within its BasicContentControl module. Selection relies on specific named attributes and the switch element.

The simplest type of SMIL content selection uses just an attribute. Consider the following markup.

                    
 ...
 <par>
    <audio src="audio.rm"/>
    <video src="video.rm"/>
    <textstream src="stockticker.rt"/>
    <textstream src="closed-caps.rt" systemCaptions="on"/>
 </par>
 ...

The result of this construct is that the streams audio.rm, video.rm and stockticker.rt all start in parallel. The stream closed-caps.rt starts only if closed captioning is enabled. The meaning of attributes such as systemCaptions is interesting. They encapsulate implicit expressions as well as values. In the previous example, systemCaptions="on" means "express this element if the current preference for captions is on".

For other attributes, the implied expressions are different. For example, systemBitrate="28000" means that the enclosing element should be expressed if the current system bit rate is greater than or equal to 28000. The built-in attributes, known as System Test attributes in the SMIL specification, are, quite reasonably, oriented towards multimedia presentation control. They do not have particularly broad applicability to general content adaptation. There is an extension mechanism, by which new attributes can be defined. This allows new types of condition to be applied. The mechanism is opaque to the way in which expressions are evaluated.
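The implicit expressions behind these two attributes can be spelled out in Python. The sketch covers only systemCaptions and systemBitrate, and the shape of the system dictionary is an assumption; it is not a SMIL implementation.

```python
def is_expressed(attributes, system):
    """Return True if every recognised test attribute on the element passes.

    `system` holds the current settings, e.g. {"captions": "on", "bitrate": 56000}.
    Attributes that are not test attributes (such as src) are ignored.
    """
    tests = {
        # Equality test: the preference must match the attribute value.
        "systemCaptions": lambda value: system["captions"] == value,
        # Threshold test: the system bit rate must be at least the value.
        "systemBitrate": lambda value: system["bitrate"] >= int(value),
    }
    return all(
        tests[name](value)
        for name, value in attributes.items()
        if name in tests
    )
```

Each attribute name selects its own comparison, which is why the expressions are described as implicit: the markup carries only the value.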

The SMIL approach was examined during the development of DISelect. DISelect shares the notion of using attributes or elements for controlling selection but takes a slightly more explicit approach to the representation of the expressions that are used. DISelect is also constructed as a module, designed for inclusion within other language specifications.

3 Enriched capabilities with DISelect

As we've seen, there are capabilities in other W3C and IETF specifications that allow some level of control over the particular variant of a resource that is returned to a browser in response to a request. While these capabilities can support many of the use cases commonly found on the conventional Web, practical experience has found them lacking when selecting material for the myriad of different kinds of device that are now able to access the Web.

We've already noted issues with existing mechanisms where two different devices both support the same media type, but have other requirements. For example, consider images encoded using the Portable Network Graphics [PNG] specification being delivered to two devices: a tiny mobile phone and a personal digital assistant with a large screen. Suppose that an author is required to provide different versions of a particular image in order to satisfy design criteria for a page that must be delivered to both devices. Different variants of the image are prepared to satisfy the criteria. However, because both devices support the same image encoding, PNG, content negotiation cannot be used to provide the appropriate version in this case. The criteria used in content negotiation are simply not sufficiently fine-grained to cater for even this simple level of selection.

Solutions that support a wide variety of different types of device require the ability to make use of a wider range of different alternative content variants, and employ a much richer set of criteria in connection with selection. We'll look at these aspects in the following sections.

3.1 Richer set of alternatives

3.1.1 Wider variety of variants

The previous example illustrated a situation in which multiple variants of a particular resource share a media type. This is a common situation, and it is as true for audio and video types as it is for the images from the earlier example. Different variants of an audio or video stream may share a common media type, but may have been created using different amounts of compression. This allows different variants to be appropriate for the bandwidth available on the different networks over which they are transmitted to the devices.

There are also situations in which a completely different representation of the resource might be required. For example, consider again the situation where the variant is an image, but suppose now that it is actually a map. Variants that are also images may be appropriate for use on some devices. However, where a device has a particularly small screen, it may prove impossible to convey the required meaning, such as the location of a hotel. In this case, it may be better to use a completely different type of media. A set of instructions for travelling to the hotel might be preferable. These might be delivered in audio format or as simple text, depending on the capabilities of the device. Once again, there might be multiple variants of the same media type for each of these different kinds of media.

Generally, the large number of different types of device with different characteristics and capabilities requires the ability to support a wide variety of variants for any resource.

3.2 Richer set of criteria for selection

We've seen that the need to support a myriad of devices requires the ability to support a wide range of variants. It also requires the ability to use a wide variety of types of selection criteria. We saw earlier that media types alone are insufficient for discrimination between variants. Many devices may support the same media type, but might be best served with variants that differ in other ways. These differences might affect properties such as size, color range, compression scheme, video frame rate or audio bit rate.

In the following sections, we'll look at the sources of criteria on which selection might be based.

3.2.1 Criteria from within the delivery context

The delivery context is a set of characteristics drawn from a number of sources, including the device itself and the network by which it is connected to the Web. Generally, any of the characteristic values from the delivery context might be used in selecting between the available variants. We'll examine the sources of the values in the delivery context in the following sections.

3.2.1.1 Device characteristics

The physical properties and basic capabilities of the device form a major part of the delivery context. Properties such as the physical size of the display, its color capabilities and whether or not it has a pointing device are good examples of such characteristics.

Other key characteristics may depend on software in the device. For example, the media types of image that it can display during browsing may be more a characteristic of its browser than its hardware. Sometimes this can lead to anomalies, where a characteristic may depend on the application that is performing the rendering. Generally, however, in the context of the DISelect specification, the application that is of major interest is the browser on the device.

3.2.1.2 Network characteristics

In addition to device characteristics, the characteristics of the network in use may be important in selecting between variants. For example, if a device is connected over a low bandwidth network, a more compressed variant of a media resource might be chosen than when a higher bandwidth network is available.

3.2.1.3 User preferences

It may be possible for device users to specify preferences concerning the way in which information is presented. For example, some users might elect to receive monochrome images or highly compressed video data, even though their device can support color and their network can support higher quality video. Some users might wish their device to use larger fonts, to aid readability, or to use or avoid particular colors to improve perceived contrast. Whatever the underlying reason, if the device allows such customisation, the resulting values can form part of the delivery context.

3.2.1.4 Dynamic characteristics

Network characteristics and user preferences are examples of characteristics that can be dynamic. They may change while the device is being used to access the Web. If such information changes, and those changes can be reflected in the delivery context, they can, in principle, be taken into account during adaptation. Whether or not a particular implementation of DISelect can utilise such characteristics depends on whether the changes are reflected in the delivery context in which it executes.

3.2.1.5 HTTP request context

We've already noted that HTTP provides headers that are potentially useful in adaptation (see 2.2.2 Server-driven negotiation). In addition, the HTTP specification [HTTP] mentions that it is also possible for a server to use any other information from the request, including parameters, in choosing the representation to be returned during server-driven negotiation. While such information does not replace the need for other delivery context, it is certainly a useful source of characteristic values that may not be available elsewhere. Natural language preferences, from the Accept-Language header, are a good example.

3.2.2 Criteria external to the delivery context

In addition to supporting criteria that originate within the delivery context, DISelect also supports other sources of information. These other sources are discussed in the following sections.

3.2.2.1 Host language document

The full profile of DISelect allows content selection expressions to refer to the host language document. For example, if DISelect is being used within a DIAL [DIAL] document, content selection expressions can include references to the structure and content of that document.

3.2.2.2 Metadata associated with the document

Where the host language document also includes additional metadata, the full profile of DISelect allows content selection expressions to refer to that metadata. For example, since DIAL includes XHTML 2, content selection expressions can reference any metadata within the host document.

4 Content selection models

4.1 In-line selection

DISelect can be used embedded in-line within a host language document. This is the simplest arrangement for processing since it does not involve fetching additional external content. This model of content selection potentially suffers from a poor separation of concerns, if used carelessly. In this respect it is similar to the in-line content selection mechanisms in other specifications, for example SMIL (see 2.4 SMIL). Nevertheless, if used carefully, this form of selection can be useful. It also provides the simplest examples illustrating how DISelect can be used.

4.1.1 Example IL-01: Simple in-line selection

In the following example, a very basic device characteristic from the delivery context is used to alter the wording associated with the title of a press release.

                        
 <div class="company_item">
     <sel:select>
         <sel:when expr="dcn:cssmq-width('px') &gt; 200">
             <h3 class="company_header"> New York - July 19: Jive Announces New, High-performance Soccer 
                    Boot</h3>
         </sel:when>
         <sel:otherwise>
             <h3 class="company_header"> New, Soccer Boot</h3>
         </sel:otherwise>
     </sel:select>
     <p> New research from Jive Sports Research Labs on soccer boot technology has
...

This example illustrates part of a press release that has been written for use on a variety of different types of device. Press release titles often include information such as the date and the city where they were issued. This information, while possibly useful, is not germane to the release itself. It might be reasonable for an author to decide to eliminate it from the title, when a small device is in use. This might allow more of the display to be used for the content of the release. In the example, the author has used embedded content selection to cause the longer version of the title to be used only on devices with wider displays. The when clause includes an expression that evaluates to true if the usable width of the display is more than 200 pixels. The function dcn:cssmq-width() returns the usable width of the display in the requested units. The value px, specified in the parameter to the call, causes that value to be returned as the number of pixels. The characters &gt; are the way in which the > symbol is conventionally written inside XML markup. Strictly, XML requires only the < and & characters to be escaped within attribute values, but > is usually escaped as well because it marks the end of a tag.
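The behaviour of the select, when and otherwise elements in this example can be modelled in Python as follows. The sketch assumes that the first when clause whose expression is true is used and that otherwise applies only when none match; conditions are plain Python callables rather than XPath expressions, and the display width is passed in directly instead of being read from a delivery context.

```python
def select(branches, otherwise=None):
    """Return the content of the first branch whose condition is true.

    `branches` is a list of (condition, content) pairs; each condition is a
    zero-argument callable standing in for a when expression.
    """
    for condition, content in branches:
        if condition():
            return content
    return otherwise


def title_for(display_width_px):
    """Model of example IL-01, with the width supplied by the caller."""
    return select(
        [
            (
                lambda: display_width_px > 200,
                "New York - July 19: Jive Announces New, High-performance Soccer Boot",
            ),
        ],
        otherwise="New, Soccer Boot",
    )
```

A display exactly 200 pixels wide receives the short title, because the expression uses a strict greater-than comparison.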

The particular device characteristic chosen for this example provides a crude estimate of the amount of textual material that can be displayed. We'll look at alternative mechanisms that are in use in practical systems in 4.1.2 Example IL-02: Simple filtering.

4.1.2 Example IL-02: Simple filtering

In 4.1.1 Example IL-01: Simple in-line selection, the content selection applied only to the title of the press release. The same approach can be used to provide basic filtering that avoids sending too much information to a small device. In this example, a news story is sent in full to highly capable devices, but a summary is sent to small devices.

The first question to address is how to identify the capability of the device to which the article is to be sent. Capability in this sense is likely to involve some complex combination of memory size, processor performance and display size. While it would be feasible to compute a value that represents the capability of the device from the delivery context, in this example we'll use an alternative approach that is common in practical adaptation systems. This involves allocating devices to groups, according to their capabilities, and then providing materials to satisfy the groups. The creation and naming of such groups is not yet standardized. We need to postulate the existence of a suitable DISelect extension function that returns the name of the group to which the device belongs. In this example, the function is named eg:getGroupName(), where eg: is the prefix representing the namespace associated with the function. We'll also assume that the function returns one of the three values high, medium or low to indicate the capability of the device in displaying articles. In practical systems, there is usually a larger number of such groups.

                        
<div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <p class="abstract">
        The tiny Cornish village of Boscastle was devastated yesterday, by serious flooding caused
        by torrential rain.
    </p>
    <p class="summary" expr="eg:getGroupName() = 'high' or eg:getGroupName() = 'medium'">
        Properties were seriously damaged and vehicles were washed into the sea by the floods,
        which followed unprecedentedly heavy rain in the hills just inland of the village itself. Several buildings
        collapsed and many residents were rescued by helicopter.
    </p>
    <p class="body" expr="eg:getGroupName() = 'high'">
        In addition to houses and cottages, a number of shops in the village center have been badly
        damaged. It is likely to be months if not years before they will be able to reopen.
        Residents are fearful of the damage that may be caused to the economy of the
        village. The floods struck at the height of the tourist season, which is
        crucial for the financial viability of many businesses in the village.
        .... 
    </p>   
</div> 
....

In this example, content selection is carried out using the attribute form. The expr attribute is added to host language elements to control whether or not they are expressed. The heading and the first paragraph are always sent to any device that requests the page. The summary, in the second paragraph, is sent to devices considered to have high or medium capabilities. Finally, the body of the article is sent only to devices with high capabilities. The result is that the content in the article is effectively filtered according to the capabilities of the device being used to access the page.
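The filtering effect can be reproduced with a small Python sketch using the standard library XML parser. It is not a DISelect processor: it assumes that every expr value is a set of eg:getGroupName() = '...' comparisons joined by or, extracts the group names with a regular expression, and receives the device group as an argument rather than from an extension function. The paragraph text is abbreviated.

```python
import re
import xml.etree.ElementTree as ET

# Matches comparisons of the form: eg:getGroupName() = 'high'
GROUP_TEST = re.compile(r"eg:getGroupName\(\)\s*=\s*'([^']*)'")


def filter_by_group(root, group):
    """Remove children whose expr does not accept `group`; drop expr elsewhere."""
    for parent in list(root.iter()):
        for child in list(parent):
            expr = child.attrib.pop("expr", None)
            if expr is not None and group not in GROUP_TEST.findall(expr):
                parent.remove(child)
    return root


ARTICLE = """<div class="article">
  <h3>Boscastle Devastated by Floods</h3>
  <p class="abstract">...</p>
  <p class="summary" expr="eg:getGroupName() = 'high' or eg:getGroupName() = 'medium'">...</p>
  <p class="body" expr="eg:getGroupName() = 'high'">...</p>
</div>"""


def classes_sent(group):
    """List the class of each paragraph that survives filtering for `group`."""
    root = filter_by_group(ET.fromstring(ARTICLE), group)
    return [p.get("class") for p in root.findall("p")]
```

Calling classes_sent with low, medium and high shows the abstract alone, the abstract plus summary, and all three paragraphs respectively.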

Although this is, once more, a simple example of the use of content selection, it does illustrate some interesting issues. For example, filtering content based on delivery context can be seen as a good thing or a bad thing depending on your point of view. Filtering out content that would be a problem for a particular device might be seen as improving access to material that would otherwise be unavailable. Such content might result in the device issuing error messages or might even result in some kind of failure. Alternatively, filtering might be seen as preventing users of particular devices having access to some of the information. In reality, of course, content selection, like most technologies, can be used both wisely and unwisely. We'll return to this discussion in 5.2 Adaptation and the ubiquitous Web. For now, we'll continue to assume that the reasons for undertaking selection are benign and that they increase access to material on the Web.

4.2 In-line selection with inclusion

So far we've looked at examples where the selection is in-line within the content. In this section we'll look at alternatives where the selection is decoupled in some way from the content in which the selected material appears.

4.2.1 Example IN-01: Controlled embedding

We can illustrate the use of DISelect for controlling embedding using a modification of 4.1.2 Example IL-02: Simple filtering.

                        
 <div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <xi:include href="summary.xml" 
        expr="eg:getGroupName() = 'high' or eg:getGroupName() = 'medium'"/>
    <xi:include href="body.xml" expr="eg:getGroupName() = 'high'"/>
</div> 
....

In this example, the content associated with the summary and body of the article is in separate resources with relative URLs summary.xml and body.xml respectively. They are included into the main document using XML Inclusions [XInclude]. We are, of course, assuming that the host language, in which DISelect is being used, also supports XInclude.

The selection expressions in this example are identical to those in 4.1.2 Example IL-02: Simple filtering. However, in this example, they control whether or not the two xi:include elements are processed. Once more, we assume the existence of the extension function eg:getGroupName() that returns one of the three values high, medium or low to indicate the capability of the device in displaying articles. The inclusion of the summary takes place if the device is considered to have high or medium level capabilities. The inclusion of the body takes place only for devices considered to have high capabilities.
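For comparison, the same filtering can be sketched with the DISelect if element wrapped around each inclusion, rather than with an expr attribute on the xi:include elements themselves. This is only an illustration. It assumes that the host language permits sel:if at this point, that the sel and xi namespace prefixes are declared elsewhere in the document, and that the processor evaluates DISelect before resolving inclusions. The extension function eg:getGroupName() remains hypothetical.

```xml
<div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <!-- Summary: kept for 'high' and 'medium' capability devices -->
    <sel:if expr="eg:getGroupName() = 'high' or eg:getGroupName() = 'medium'">
        <xi:include href="summary.xml"/>
    </sel:if>
    <!-- Body: kept only for 'high' capability devices -->
    <sel:if expr="eg:getGroupName() = 'high'">
        <xi:include href="body.xml"/>
    </sel:if>
</div>
```

In both forms the selection stays in the referring resource, so the degree of separation between selection and content is the same. Only the syntax differs.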

Example IN-01 illustrates a slightly greater separation of concerns than 4.1.2 Example IL-02: Simple filtering. The selection occurs within the scope of the referring resource. The content being selected contains no explicit selection statements. To some extent, selection and content have been separated. In this case, however, the selection is still in-line in the referring resource. It is, of course, possible to apply selection within the resource being referenced. We'll look at this approach in the next section.

4.3 External selection

In this section we'll look at mechanisms that delegate content selection to the resources being referenced. The implementation of such delegation assumes that the delivery context, on which the selection decisions are made, is available to the referenced resources. This is normally the case in workable adaptation systems.

4.3.1 Example ES-01: Embedding externally selected content

In this example, DISelect is used to control the use of particular CSS style sheets. The decision is based on the contents of the delivery context. This allows much more granularity in the way decisions between styles are made than, for example, using CSS Media Types (see 2.1.2 CSS Media types) or CSS Media Queries (see 2.1.3 CSS Media queries). It also allows the choice of styling to be made without affecting the content in any way, providing a strong separation of concerns. Once again, we are assuming that the host language in which DISelect is embedded supports XInclude [XInclude].

                        
<html>
  <head>
    <title>Boscastle</title>
    ...
    <xi:include href="../style/articlestyle.xml"/>
    ...
  </head>                        
  <body>
  ...                      
                        
  <div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <p class="abstract">
        The tiny Cornish village of Boscastle was devastated yesterday, by serious flooding caused
        by torrential rain. 
        ....

In this page, the xi:include element embeds the resource at the relative URL ../style/articlestyle.xml. This resource uses DISelect to define how style sheets are selected for use with the page. Here is the markup for the resource.

                        
                        
 ...
  <sel:select>
    <sel:when expr="eg:getStyleSheetSupport() = 'excellent'">
      <link rel="stylesheet" type="text/css" href="../styles/sensational.css"/>
    </sel:when>
    <sel:when expr="eg:getStyleSheetSupport() = 'basic'">
      <link rel="stylesheet" type="text/css" href="../styles/mediocre.css"/>
    </sel:when>
  </sel:select>
 ...

In this case, the result of the selection is that the page includes a reference to one of the CSS style sheets sensational.css or mediocre.css, or omits the reference altogether. The xi:include element in the page is replaced by the contents of the articlestyle.xml resource. Again, we've assumed the existence of an extension function that returns the appropriate value for testing. In this case it's the function eg:getStyleSheetSupport(). If the device's CSS support is considered excellent, the style sheet sensational.css is used, while if it is considered basic, mediocre.css is used. Other values prevent a style sheet reference from being included in the page.
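If an author would rather supply a default than omit the reference, the DISelect otherwise element can provide one. The following sketch extends articlestyle.xml on that assumption. The style sheet plain.css is a hypothetical name introduced only for this illustration, and eg:getStyleSheetSupport() is still an assumed extension function.

```xml
<sel:select>
    <sel:when expr="eg:getStyleSheetSupport() = 'excellent'">
        <link rel="stylesheet" type="text/css" href="../styles/sensational.css"/>
    </sel:when>
    <sel:when expr="eg:getStyleSheetSupport() = 'basic'">
        <link rel="stylesheet" type="text/css" href="../styles/mediocre.css"/>
    </sel:when>
    <!-- Hypothetical fallback: used only when neither test above is true -->
    <sel:otherwise>
        <link rel="stylesheet" type="text/css" href="../styles/plain.css"/>
    </sel:otherwise>
</sel:select>
```

Whether a fallback is appropriate depends on the device. For a device with no CSS support at all, omitting the reference, as in the original markup, may well be the better choice.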

For simplicity, we've omitted details relating to the host language in which the DISelect elements in articlestyle.xml are embedded. We've also glossed over issues associated with embedding arbitrary XML fragments using XInclude. That is deliberate, since in this primer we are trying to illustrate possible uses for DISelect, rather than providing a particular specification. Such matters are topics for the host languages that make use of DISelect.

The use of the mythical extension function eg:getStyleSheetSupport() in this example is also for simplicity. In reality, the choice of style sheet is complex and may go beyond the contents of the CSS itself. For example, some devices operate better with external style sheets and some with in-line style sheets. Some devices support subsets of specific CSS versions, and may require different style selectors and properties to achieve the author's intent. Finally, some devices do not support CSS at all. These types of issue are matters for processors that provide adaptation. DISelect can play a role by allowing authors to associate particular stylesheets with specific devices, groups of devices, or other delivery context properties. So while the selection criteria, in this example, are unrealistically simple, the approach is not.

4.3.2 Example ES-02: Syntactic sugar for generic resources

URIs represent resources. In 1.3.1 Selection we noted that resources can occur in multiple variants and that one aspect of adaptation was selection between such alternative representations. The existence of multiple variants of a resource suggests that there may be at least two types of usage for URIs. One usage references the resource without regard to a particular representation while the other type references a particular variant. To distinguish these types of resources and URIs, the term generic resource [Generic Resource] was introduced. A generic resource is a resource that is well specified, as a concept, but not so completely defined that it can be represented by only a single bit stream. Variants of a resource may arise for a number of different reasons:

Representations of a resource that represents today's weather forecast vary with time.
Multiple representations exist when a particular document is translated into multiple languages.
Multiple representations of a particular piece of media, such as a video clip, may exist to satisfy the needs of different kinds of device.

In each case it is clearly desirable to be able to refer both to the generic resource and to the specific variants using URIs.

In 4.3.1 Example ES-01: Embedding externally selected content, selection takes place externally but the results are explicitly embedded into the page using XInclude. The same selection mechanism can be employed but the syntax of the page reference can be improved if the reference is expressed as a URI to a generic resource.

                        
<html>
  <head>
    <title>Boscastle</title>
    ...
    <link rel="eg:genericStyleSheet" href="../styles/articlestyle.xml"/>

    ...
  </head>                        
  <body>
  ...                      
                        
  <div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <p class="abstract">
        The tiny Cornish village of Boscastle was devastated yesterday, by serious flooding caused
        by torrential rain. 
        ....

In this version of the page, the xi:include element has been replaced with a link element that refers to the articlestyle.xml resource. The rel attribute on the link defines the relationship with this resource. We've assumed the existence of a value eg:genericStyleSheet which means that the referenced resource is generic and represents the style sheet for the page. Because this resource is generic, some method of adapting it to the delivery context must be applied before it can be used. DISelect provides a convenient way to describe the selection criteria.

                        
                        
 ...
  <sel:select>
    <sel:when expr="eg:getStyleSheetSupport() = 'excellent'">
      <link rel="stylesheet" type="text/css" href="../styles/sensational.css"/>
    </sel:when>
    <sel:when expr="eg:getStyleSheetSupport() = 'basic'">
      <link rel="stylesheet" type="text/css" href="../styles/mediocre.css"/>
    </sel:when>
  </sel:select>
 ...

This markup is in fact identical to that in 4.3.1 Example ES-01: Embedding externally selected content. The only difference in this entire example is the syntactic sugar provided by the link element in the page itself. Clearly, the processor adapting the markup in the page must be able to interpret and process the link element within the head in an appropriate way, for this mechanism to work.

While DISelect provides a mechanism for making the selection between variants, it is not in itself an appropriate way to solve the problem of discovery of variants and generic resources [Link Gen Resource].

As with the earlier example of external selection, this approach also maintains a strong separation of concerns between the page markup and the set of variants that provide the style sheets for use with the device making the request. An advantage with such syntactic sugar is that it obviates the need for the host language explicitly to support extensions such as XInclude. This may make it easier to use this approach with existing markup languages.

4.3.3 Example CN-01: Implementing content negotiation

Modern web servers and web application servers support HTTP content negotiation. We looked at the capabilities in 2.2 HTTP content negotiation. We noted that HTTP content negotiation is able to use only a small fraction of the information available in the delivery context for selecting the particular variant of a resource that will be served.

In addition to the limitations in criteria that can be used for selection, the representation of those criteria is server-specific. For example, the very widely used Apache web server [Apache HTTPD] employs particular Apache configuration files called type maps. These use an Apache-specific syntax to define the relationships between specific variants and the generic resource. Here is an example of a set of entries from a type map that allows the most appropriate version of an image to be chosen based on the content type that a device can support.

                        
                        
URI: foo

URI: foo.jpeg
Content-type: image/jpeg; qs=0.8

URI: foo.gif
Content-type: image/gif; qs=0.5

URI: foo.txt
Content-type: text/plain; qs=0.01

This entry relates the generic resource with URI foo to three variants of differing content type: two images and a plain text alternative. Each variant is also given a source quality value, using the qs parameter. These values allow the negotiation algorithm to choose between variants when the device is able to accept more than one content type.

DISelect offers an alternative mechanism for this particular kind of selection and, in addition, provides the ability to use any information from the delivery context in making the decision.

                        
                        
 ...
  <eg:resource href="foo">          
    <sel:select>
      <sel:when expr="eg:supportsContent('image/jpeg')">
        <eg:variant href="foo.jpeg" />    
      </sel:when>
      <sel:when expr="eg:supportsContent('image/gif')">
        <eg:variant href="foo.gif" />    
      </sel:when>
      <sel:when expr="eg:supportsContent('text/plain')">
        <eg:variant href="foo.txt" />    
      </sel:when>
    </sel:select>
  </eg:resource>                        
 ...

In this example, we assume that DISelect is embedded in a host language that explicitly represents the variants associated with a resource. The eg:resource element represents the resource. Each eg:variant element represents one variant of that resource. Once again, we've assumed a specific DISelect extension function, eg:supportsContent(), that returns true if the device supports the specified content type. Note that there is no need for explicit use of the source quality mechanism in this particular case. By default, the first sel:when element whose expression evaluates to true is chosen. The order of the sel:when elements has the same effect as that of the qs parameter in the Apache type map.

Readers familiar with the HTTP specification will probably already have spotted that the DISelect version of the selection is not equivalent to that in the Apache type map for those situations where the request includes explicit use of the quality parameter in the HTTP Accept header. This parameter allows user agents to express a preference for particular content types over others when both are available. This mechanism, which needs to be applied every time a resource is requested, is effectively superseded by the availability of information about device capabilities in the delivery context. It's worth noting that this is only the case for characteristics of the device. User preferences, such as language, cannot be treated in this way.

This example has shown how DISelect might be used to provide features similar to those provided by content negotiation in web servers. In the following sections we'll see how this might be used in markup generated during adaptation.

4.3.4 Example SF-01: Content negotiation independent of the referring resource

So far in the examples, we've looked at cases where the adaptation is complete by the time the page is returned to the requesting device. In this example, we'll look at a case where further adaptation happens even after device-specific markup has been generated.

Let's suppose that the following fragment is part of a page authored in DIAL. It includes a reference to an image showing the devastation caused by the floods.

                        
  ...                      
                        
  <div class="article">
    <h3>Boscastle Devastated by Floods</h3>
    <img src="mainstreet.xml" alt="Boscastle's main street under water"/>
    <p class="abstract">
        The tiny Cornish village of Boscastle was devastated yesterday, by serious flooding caused
        by torrential rain. 
        ....

Let's suppose that an adaptation processor transforms the page in which this fragment appears into XHTML Basic [XHTML Basic]. The fragment itself is actually unchanged, since as well as being valid DIAL, it is also valid XHTML Basic.

The adapted page, including this fragment, is sent to the device. While processing it, the user agent on the device parses the img element and makes a separate request for the image mainstreet.xml. At the server, the resource mainstreet.xml and its variants are defined as follows:

                        
                        
 ...
  <eg:resource href="mainstreet.xml">          
    <sel:select>
      <sel:when expr="dcn:cssmq-width('px') &gt; 200">                
        <sel:select>
          <sel:when expr="eg:supportsContent('image/jpeg')">
            <eg:variant href="bigmsflood.jpeg" />    
          </sel:when>
          <sel:when expr="eg:supportsContent('image/gif')">
            <eg:variant href="bigmsflood.gif" />    
          </sel:when>
        </sel:select>
      </sel:when>
      <sel:when expr="dcn:cssmq-width('px') &gt; 100"> 
        <sel:select>
          <sel:when expr="eg:supportsContent('image/jpeg')">
            <eg:variant href="smallmsflood.jpeg" />    
          </sel:when>
          <sel:when expr="eg:supportsContent('image/gif')">
            <eg:variant href="smallmsflood.gif" />    
          </sel:when>
        </sel:select>
      </sel:when>
    </sel:select>
  </eg:resource>                        
 ...

In this example, both the type and size of the image are subject to selection. The outer sel:select element chooses between two sets of images based on the width of the display in pixels. We've seen already that the standard DISelect function call dcn:cssmq-width() returns the usable width of the display in pixels. If the display is wider than 200 pixels, one set of variants is used. If it is between 101 and 200 pixels wide, a different set is used. Within those sets, specific variants are chosen according to the image types supported by the device. If a variant is available, according to the criteria used, the adaptation processor returns it to the user agent.
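The nested selection can also be written as a single flat sel:select by combining the width and content type tests in each expression. The sketch below is intended to select the same variants as the nested form. It assumes that expressions support the and operator in the same way that they support the or operator used in Example IN-01, and that the first sel:when element whose expression evaluates to true is chosen. As before, eg:resource, eg:variant and eg:supportsContent() are assumed for illustration.

```xml
<eg:resource href="mainstreet.xml">
    <sel:select>
        <!-- Displays wider than 200 pixels: large images, JPEG preferred -->
        <sel:when expr="dcn:cssmq-width('px') &gt; 200 and eg:supportsContent('image/jpeg')">
            <eg:variant href="bigmsflood.jpeg"/>
        </sel:when>
        <sel:when expr="dcn:cssmq-width('px') &gt; 200 and eg:supportsContent('image/gif')">
            <eg:variant href="bigmsflood.gif"/>
        </sel:when>
        <!-- Displays from 101 to 200 pixels wide: small images.
             Wider displays that support JPEG or GIF were matched above. -->
        <sel:when expr="dcn:cssmq-width('px') &gt; 100 and eg:supportsContent('image/jpeg')">
            <eg:variant href="smallmsflood.jpeg"/>
        </sel:when>
        <sel:when expr="dcn:cssmq-width('px') &gt; 100 and eg:supportsContent('image/gif')">
            <eg:variant href="smallmsflood.gif"/>
        </sel:when>
    </sel:select>
</eg:resource>
```

The flat form repeats the width test, whereas the nested form states it once for each set. Which reads better is largely a matter of taste and of how many criteria are involved.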

There are two situations within this nested selection that could result in an image not being returned. First, if the display is 100 pixels wide or less, no variant will be selected. Second, if the device does not support either JPEG or GIF images, no variant will be selected. The result is that the user agent will not receive an image and the alternate text will be shown instead.
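If the author would prefer that narrow displays receive the small image rather than the alternate text, the second outer sel:when could be replaced by a sel:otherwise element. This is a sketch of one possible design choice, not a recommendation. It reuses the variants already defined and the assumed eg:resource, eg:variant and eg:supportsContent() names.

```xml
<eg:resource href="mainstreet.xml">
    <sel:select>
        <sel:when expr="dcn:cssmq-width('px') &gt; 200">
            <sel:select>
                <sel:when expr="eg:supportsContent('image/jpeg')">
                    <eg:variant href="bigmsflood.jpeg"/>
                </sel:when>
                <sel:when expr="eg:supportsContent('image/gif')">
                    <eg:variant href="bigmsflood.gif"/>
                </sel:when>
            </sel:select>
        </sel:when>
        <!-- Hypothetical change: every other display width, including
             100 pixels or less, now falls through to the small image set -->
        <sel:otherwise>
            <sel:select>
                <sel:when expr="eg:supportsContent('image/jpeg')">
                    <eg:variant href="smallmsflood.jpeg"/>
                </sel:when>
                <sel:when expr="eg:supportsContent('image/gif')">
                    <eg:variant href="smallmsflood.gif"/>
                </sel:when>
            </sel:select>
        </sel:otherwise>
    </sel:select>
</eg:resource>
```

A device that supports neither JPEG nor GIF still receives no image with this change, so the alternate text continues to be used in that case.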

This example shows a very strong separation of concerns. The content selection does not occur anywhere in the content of the page. It is delegated to the way in which a particular image variant is selected when requested by the user agent.

5 Additional considerations

5.1 Content selection and adaptation

Content selection, as implemented with DISelect, plays a small but important role within the wider field of adaptation. It is one way in which authors can express their intent about the way in which their content is used.

Adaptation itself is a much broader topic, of course. Much of it relates to the way in which an author's intent is expressed in materials sent to specific devices. Implementations of adaptation processors often automate the detailed work associated with achieving the author's intent on specific devices. These often include working around limitations in the device, emulating missing features and overcoming incomplete implementations of standards. It seems unlikely that there will be agreed, standard representations for such detailed analysis of device behavior in the near future. At present, it seems difficult enough to get agreement on how to represent the size of the display on a device. However, while such matters can be considered implementational details at present, it is important that there are standard ways of representing author intent. DISelect plays a small part in such representations, as we have seen.

5.2 Adaptation and the ubiquitous Web

We noted earlier that the process of filtering content based on delivery context can be seen as a good thing or a bad thing depending on your point of view. Filtering out content that would be a problem for a particular device might be seen as improving access to material that would otherwise be unavailable. Such content might result in the device issuing error messages or might even result in some kind of failure.

Alternatively, filtering might be seen as preventing some users having access to some of the information, a situation which seems immediately at odds with the overall goals of ubiquitous access. Of course, it's not as simple as that. Filtering that prevents children from receiving content inappropriate to their age would probably be seen as protecting the user. Filtering that allows a site to work on a mobile device but that prevents access by a user with a specific disability would rightly be seen as discriminatory.

As with most tools, it's not content selection itself, but the way in which it is used that leads to such issues. This is true for the Web in general, of course. The existence of the Web Accessibility Initiative [WAI] and the steps that some governments have taken, in introducing related legislation, make that clear. Content selection, and indeed all aspects of adaptation, provide authors with the means by which they can ensure that materials delivered to particular devices will result in a functional user experience. They do not, however, absolve authors of the responsibility for ensuring that materials are available to support their community of users. Adaptation is an enabler for ubiquitous access to the Web. It cannot, on its own, guarantee that access.

5.2.1 Author proposes, user disposes

One principle which has helped in solving issues of Web accessibility goes by the title "author proposes, user disposes". The idea is that authors make materials available to users but that ultimately, users are in control of the final rendering. Desktop browsers, for example, usually offer the ability to zoom the display of a web page to assist users with visual disabilities. Browsers may also allow users to set up style sheets that override those provided by a page's author. These can alter colors, sizes and other aspects of the presentation.

While this principle has undoubtedly proved useful, it relies on the end user taking specific action in order to be able to perceive a site in the way most appropriate for them. Adaptation might offer additional approaches that could, over time, reduce the need for users to have to invest time in tailoring the user experience.

It has been noted on many occasions that the delivery context contains more than just device-specific characteristics. It also contains information relating to networks and intermediary services and may include user preferences. If such preferences were sufficiently comprehensive and if the abstractions used by authors sufficiently powerful, it might be possible to deliver accessible representations to users with disabilities automatically. It might be possible to adapt a site to make best use of a particular attached screen reader or to provide a version in colors and sizes most comfortable for a particular user without the need for explicit tailoring. Such capabilities are still in the future, but the technologies that underpin them are being developed. Issues still remain. One of the most difficult is the protection of those parts of the delivery context that a particular user might consider sensitive. Despite the challenges, the goal is worth pursuing.

Feature	Examples
select element	IL-01, ES-02, CN-01, SF-01
when element	IL-01, ES-02, CN-01, SF-01
otherwise element	IL-01
nested select elements	SF-01
if element	TBD
variable element	TBD
variables in expressions	TBD
type conversion in expressions	TBD
profile names	TBD
versions	TBD
accessing the host document	TBD
path-based access to the delivery context	TBD
dcn:cssmq-width() function	IL-01
expr attribute in the host language	IL-02, IN-01, ES-01
default values	TBD
extension functions	IL-02, IN-01, ES-01
selid attribute	TBD
selidname attribute	TBD