W3C

List of comments on “Content Transformation Guidelines 1.0” (dated 1 August 2008)

There are 85 comments (sorted by their types, and the section they are about).

general comments

Comment LC-2043
Commenter: Mark Baker <distobj@acm.org> (archived message)
Context: Document as a whole
assigned to Jo Rabin
Resolution status:

Here are my comments. In summary, the group really needs to decide
whether this is a guidelines document, or a protocol. It can't be
both. A lot of work remains.

Comment LC-2025
Commenter: casays <casays@yahoo.com> (archived message)
Context: Document as a whole
Not assigned
Resolution status:

In general, I must state unequivocally that our experience with
current transformation proxies deployed throughout the world has
always been negative, since all proxies seem to transform original
mobile content regardless, with results ranging from passable to
outrageously unusable. The draft, while an interesting attempt to
bring some order in the wild practices that abound in the mobile Web,
is still vague and incomplete in several points, and thus, in its
present form, may not stem some of the more egregious forms of
transcoding we have witnessed so far.

Comment LC-2097: OPES
Commenter: Barry Leiba <leiba@watson.ibm.com> on behalf of IAB (archived message)
Context: Document as a whole
assigned to Jo Rabin
Resolution status:

To the W3C Mobile Web Best Practices Working Group:

The Internet Architecture Board has reviewed the subject document, and notes that
it has previously reviewed related work done in the IETF in the Open Pluggable
Edge Services (OPES) Working Group. In its preview and review of OPES work, the
IAB expressed its concerns about privacy, control, monitoring, and accountability
of such services in RFC 3238 [ http://tools.ietf.org/html/rfc3238 ].

We have no specific architectural concerns with the "Content Transformation
Guidelines" document as written; it does seem to take into account the questions
raised during the OPES discussions. We would like, though, to make that explicit
by specifically documenting that you reviewed and considered the issues in RFC
3238.

Barry Leiba, for the Internet Architecture Board ( http://iab.org )

substantive comments

Comment LC-2065
Commenter: Dennis Bournique <db@wapreview.com> (archived message)
Context: Document as a whole
assigned to Bryan Sullivan
Resolution status:

My name is Dennis Bournique. I write about mobile browsing, primarily from a user perspective, at http://wapreview.com. I've done a little web development, mostly mobile specific sites, but I'm by no means an expert on the technical side of this issue.

Putting on my user hat, I'd like to make a request that the Content Transformation Guidelines include a requirement that content transformation proxies "must" provide end users (consumers of web content) with a way to turn off transformation both globally and on a site by site basis.

As an end user, I’ve experienced both the joys and the frustrations of using content transformation proxies.

In general, I believe in content transformation as a valuable tool to make web content, which would otherwise be difficult or impossible to use, available through the limited browsers of many mobile phones.

I have also been frustrated when a carrier or content provider unilaterally imposes content transformation with no way for me to disable it. I've been unable to access content through content transformation proxies that was previously available on the same device using a direct connection. This has happened both with installable content such as midlets and ringtones and also with pure html and xhtml pages, including mobile optimized pages and those that are not. I have also seen my secure end to end HTTPS traffic being forced through content transformation proxies, exposing it to the potential for a "man in the middle" attack.

I understand that the Guidelines are intended to prevent these sorts of problems by specifying when content transformation proxies must allow content to flow directly between server and user agent without modification. This is good, but no technical solution can ever be perfect. There will always be edge cases where content transformation does more harm than good. For this reason it is important that end users have the option to opt-out of content transformation.

I propose that the Guidelines be amended to include the following or similar language.

"...1. Content transformation proxies, if they are modifying traffic between a server and a user agent in any way, MUST provide a mechanism allowing the end user to resubmit the request and disable content transformation for the duration of the current session."

"...2. Content transformation proxies must provide a means for end users of that proxy to disable all content transformation until they take explicit action to re-enable it."

Comment LC-2089: Basic Content Tasting
Commenter: Heiko Gerlach <heiko.gerlach@vodafone.com>
Context: in
assigned to Jo Rabin
Resolution status:

Old text:

Request resource with original headers
    If the response is a 406 response:
        If the response contains Cache-Control: no-transform, forward it
        Otherwise re-request with altered headers
    If the response is a 200 response:
        If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it
        Otherwise assess whether the 200 response is a form of "Request Unacceptable"
            If it is not, forward it
            If it is, re-request with altered headers

BUT WHERE IS THE TRANSCODING?

New text:

Request resource with original headers
    If the response is a 406 response:
        If the response contains Cache-Control: no-transform, forward it
        Otherwise re-request with altered headers
    If the response is a 200 response:
        If the response contains Vary: User-Agent, an appropriate link element or header, or Cache-Control: no-transform, forward it
        Otherwise assess whether the 200 response is a form of "Request Unacceptable"
            If it is not, TRANSCODE it
            If it is, re-request with altered headers
B.1
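
The two step lists above differ only in what happens to an acceptable 200 response. A minimal Python sketch of that decision procedure follows, assuming a simplified response model. The `Response` class, its `has_mobile_link` and `looks_unacceptable` flags, and the `transcode_acceptable_200` switch are hypothetical names introduced for illustration; they do not come from the guidelines, and the treatment of other status codes is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    headers: dict = field(default_factory=dict)
    # Hypothetical flag: an "appropriate link element or header" was found.
    has_mobile_link: bool = False
    # Hypothetical flag: a 200 body judged to be a "Request Unacceptable" page.
    looks_unacceptable: bool = False


def has_no_transform(resp):
    """True if Cache-Control carries the no-transform directive."""
    value = resp.headers.get("Cache-Control", "")
    return "no-transform" in [d.strip().lower() for d in value.split(",")]


def varies_on_user_agent(resp):
    """True if the Vary header lists User-Agent."""
    value = resp.headers.get("Vary", "")
    return "user-agent" in [f.strip().lower() for f in value.split(",")]


def taste(resp, transcode_acceptable_200=True):
    """Return 'forward', 'transcode' or 're-request' for the first response.

    transcode_acceptable_200=False follows the old text,
    transcode_acceptable_200=True follows the proposed new text.
    """
    if resp.status == 406:
        return "forward" if has_no_transform(resp) else "re-request"
    if resp.status == 200:
        if (varies_on_user_agent(resp) or resp.has_mobile_link
                or has_no_transform(resp)):
            return "forward"
        if resp.looks_unacceptable:
            return "re-request"
        return "transcode" if transcode_acceptable_200 else "forward"
    # Assumption: other status codes are outside the quoted steps.
    return "forward"
```

With this sketch, `taste(Response(200))` yields `"transcode"` under the new text and `"forward"` under the old text, which is the gap the comment points at.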

Comment LC-2018
Commenter: C. M. Sperberg-McQueen <cmsmcq@acm.org> (archived message)
Context: Content Transformation Guidelines 1.0
assigned to Sean Patterson
Resolution status:

Would it perhaps be better to give this specification a more informative
title, or at least some sort of informative subtitle?

The phrase "Content Transformation" sounds to an uninitiated reader
as if it could apply to anything from the use of the data manipulation
language (e.g. SQL) in a database management system, to the use of
XSLT, or the SAX or DOM interfaces, to transform XML documents, to
the use of dynamic HTML techniques to transform data in the browser.

Perhaps "Mobile Web Content Transformation"? Or "Content Transformation
for Mobile Presentation"? Surely there are ways of making it easier
for potential readers to see whether the document is relevant to their
concerns or not.

This isn't the first W3C spec to have such a generic title; the
experience of the XML Schema specification, however, leads me to
commend to you urgently the wisdom of having a more specific, more
informative, less generic title for your document.

--Michael Sperberg-McQueen
W3C XML Activity

Comment LC-2050
Commenter: casays <casays@yahoo.com> (archived message)
Context: 2.2 Types of Transformation (Alteration of requests)
assigned to Sean Patterson
Resolution status:

1) Section 2.2.1:

The CTG distinguishes between restructuring, recoding and
optimization. This is a useful approach, and the distinction
could be used more systematically across the document. However,
without a formal definition of these terms, various parties
are left with too much leeway when classifying some operations
into one or the other of the categories. This may entail inconsistencies
regarding the interpretation of the guidelines.

The guidelines should:
a) Define formally each of the three categories, possibly on
the basis of language theory. As an example, optimization seems
to be related to equivalent token streams (for textual content),
whereas recoding seems to deal with equivalent parse trees. Some
operations are reversible, others are not. The W3C is home to
technologies such as XSLT, so there should be competence there
to help ground definitions on solid formal concepts. Basing such
definitions on formal language theory is a suggestion, not a
requirement; other formally grounded definitions are possible.
b) Define exactly how to classify an operation that spans several
categories. As an example, converting HTML to XHTML while at the
same time eliminating comments and redundant white space should
amount to a recoding.

Comment LC-2067
Commenter: Mark Nottingham <mnot@mnot.net> (archived message)
Context: 3.4 Content Deployment Conformance (Also concerns "3.5 Transformation Deployment Conformance")
assigned to François Daoust
Resolution status:

* Section 3.4 / 3.5 "A [Content|Transformation] Deployment conforms to
these guidelines if it follows the statements..." What does "follows"
mean here -- if they conform to all MUST level requirements? SHOULD
and MUST?

Comment LC-2003
Commenter: Luca Passani <passani@eunet.no> (archived message)
Context: 4.1 Proxy Forwarding of Request
assigned to Jo Rabin
Resolution status:

Also, I see that CTG does not mention "whitelists". I think it should, since many transcoders manage that. The rule (consistently with the concept that transcoders must err on the side of not transcoding) should be that whitelists can only specify which potentially mobile sites can be forced to be transcoded (and not the other way around as happens to be common today, thus potentially forcing mobile developers to ask operators in different countries to whitelist their service, which is of course unacceptable).

Comment LC-2019
Commenter: casays <casays@yahoo.com> (archived message)
Context: 4.1.1 Applicable HTTP Methods
Not assigned
Resolution status:

1) Section 4.1.1

Add to the section:

Proxies MUST NOT convert POST methods into GET ones, or vice-versa.

Rationale: This kind of transformation may make exchanges between
clients and servers inoperative. In particular, this kind of
substitution has been known to cause problems for content downloading
applications in the mobile Web.

Comment LC-2034
Commenter: Mark Baker <distobj@acm.org> (archived message)
Context: 4.1.1 Applicable HTTP Methods
Not assigned
Resolution status:

4.1.1 Applicable HTTP Methods

"Proxies should not intervene in methods other than GET, POST, HEAD and PUT."

I can't think of any good reason for that. If a request using an
extension method wants to avoid transformation, it can always include
the no-transform directive.
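
The alternative described above, where a request opts out through the no-transform directive rather than through its method, can be sketched as follows. This is only an illustration under stated assumptions: `may_intervene` and the `restrict_methods` switch are hypothetical names, and headers are modelled as a plain dict keyed by the exact header name.

```python
# Methods named in the quoted sentence from section 4.1.1.
INTERVENABLE_METHODS = {"GET", "POST", "HEAD", "PUT"}


def has_no_transform(headers):
    """True if the request's Cache-Control header carries no-transform."""
    value = headers.get("Cache-Control", "")
    return "no-transform" in [d.strip().lower() for d in value.split(",")]


def may_intervene(method, headers, restrict_methods=True):
    """Decide whether a proxy may intervene in a request.

    restrict_methods=True models the quoted draft sentence;
    restrict_methods=False models relying on no-transform alone.
    """
    if has_no_transform(headers):
        return False
    if restrict_methods and method.upper() not in INTERVENABLE_METHODS:
        return False
    return True
```

Under the draft wording a `PROPFIND` request is never touched; under the alternative it is left alone only when it carries `Cache-Control: no-transform`.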

Comment LC-2044
Commenter: Jim Jewett <jimjjewett@gmail.com> (archived message)
Context: 4.1.3 Treatment of Requesters that are not Web browsers
assigned to Bryan Sullivan
Resolution status:

http://www.w3.org/TR/2008/WD-ct-guidelines-20080801/


Section 4.1.3 ...

"""
The mechanism by which a proxy recognizes the user agent as a Web
browser should use evidence from the HTTP request, in particular the
User-Agent and Accept headers.
"""

Please clarify -- is this just the *existence* of those headers, or
the specific values? If it is the specific values, then please
provide some guidance (or a normative alternative) that new user
agents can use, before their names propagate to various whitelists.

-jJ

Comment LC-2069
Commenter: Mark Nottingham <mnot@mnot.net> (archived message)
Context: 4.1.3 Treatment of Requesters that are not Web browsers
assigned to Bryan Sullivan
Resolution status:

* Section 4.1.3 "Proxies must act as though a no-transform directive
is present (see 4.1.2 no-transform directive in Request) unless they
are able positively to determine that the user agent is a Web
browser." How do they "positively" determine this? Using heuristics is
far from a guaranteed mechanism. Moreover, what is the reasoning
behind this? If the intent is to only allow transformation of content
intended for presentation to humans, it would be better to say that.
In any case, putting a MUST-level requirement on this seems strange.

Comment LC-2038
Commenter: Mark Baker <distobj@acm.org> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

The rest of the 4.1.5.* sections all seem to be basically "Here's some
things that some proxies do". By listing them, are you saying these
are good and useful things, i.e. best practices? If so, perhaps that
should be made explicit.

Comment LC-2071
Commenter: Mark Nottingham <mnot@mnot.net> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

* Section 4.1.5 Bullet points 1 and 3 are get-out-of-jail-free cards
for non-transparent proxies to ignore no-transform and do other
anti-social things. They should either be tightened up considerably, or
removed.

Comment LC-2053
Commenter: casays <casays@yahoo.com> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

5) Section 4.1.5.

Statement to be added:

"In so far as the transformation carried out by the proxies is
to make content intended for a certain class A of devices
available to devices of another class B, then requests MUST NOT
be modified whenever a client of a certain class is accessing
content intended for its class.

If the class of request (either mobile-optimized or full-Web) is
not unambiguously determined from the URI pattern, the proxy
MUST take into account the original user-agent to avoid
unnecessary transformations."

Rationale: It is obviously pointless to transform full-Web
content accessed by full-Web capable devices (or vice-versa,
transforming mobile-optimized content for devices with mobile
browsers). Two cases illustrate the situation.
a) When full-Web devices such as advanced HTC PDAs, iPhones
or tablets access the Web, there is no guarantee that an
established server will include a no-transform directive; in
fact, it might explicitly leave it out to allow transformation
to cater for non-full-Web capable devices. Further, the
proposed heuristics will not work: the MIME types of returned
content will indicate full-Web content (e.g. text/html), as
well as the DOCTYPE (e.g. -//W3C//DTD HTML 4.01//EN).
b) When i-Mode terminals access i-Mode applications, there is
no guarantee that the corresponding servers return a no-transform
directive (since it is irrelevant for i-Mode applications).
Heuristics may not work either, since content is largely returned
as text/html, and without any DOCTYPE declaration.

Comment LC-2072
Commenter: Mark Nottingham <mnot@mnot.net> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

* Section 4.1.5 What is a "restructured desktop experience"?

Comment LC-2054
Commenter: casays <casays@yahoo.com> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

I posted the following message in the WMLProgramming mailing list.
People have suggested that I publish it as a formal comment to the CTG
draft, so here it is, under the heading "Allowing modifications of the
HTTP header field user-agent: rationale missing".

Eduardo Casais
areppim AG
Bern Switzerland
------------------

I would like to review (a last time) an issue that reoccurs in all
discussions about transcoders.

> Changing User Agent or other headers is not prohibited by HTTP

The first thing to stress is that the user-agent is essential to drive
content selection and generation processes, both in the mobile Web and
in the desktop Web.

a) In the mobile Web, the user-agent is directly associated to the
actual device, and hence serves as a key to characteristics such as
screen dimensions, preferred content types, etc. The advent of
uaprof/ccpp was supposed to make this mapping unnecessary, but it is
not the case: uaprof descriptions are often missing, point to invalid
URLs, omit important information, or are just plain unreliable. Device
databases like WURFL, based on user-agent mappings, thus remain
indispensable.
b) In the desktop (non-mobile) Web, developers have long relied upon
the user-agent to identify the browsers issuing requests in order to
tailor content to their "quirks". This has been going on at least
since the times of the Netscape / IE wars.

Let us now examine the use cases of a mobile Web browser accessing the
Internet, and evaluate the relevance of the user-agent -- assuming
that transcoders systematically substitute the original value with a
new one.

1.a User-agent-switcher.
The Web server is able, based on the user-agent, to
provide a mobile-optimized or a full-web service.
It therefore needs the original user-agent; modifying
it is unhelpful.
2.a Mobile Web only.
2.a.1 Generic content.
The server returns generic mobile content, without
customizing it for any specific user-agent.
This kind of application is rare, and often
corresponds to surviving examples of text-only
services developed for older PDA and WAP 1 devices.
Since the server does not use the original user
agent, modifying it is useless.
2.a.2 Mobile with default.
The server returns mobile-optimized content. When
not recognizing the user-agent, it returns a default,
best-effort representation, perhaps with a message
suggesting that the content is tailored for mobile
devices.
Since the server relies upon the user-agent, and is
able to return a default representation, modifying it
is unhelpful.
2.a.3 No default.
The server returns mobile-optimized content, but will
return an error (page with "unsupported browser" warning
or return code "request not acceptable") whenever it
does not recognize the user-agent. In this case, modifying
the original user-agent is most unhelpful, as it guarantees
that the server will not recognize it as a valid
mobile one. If the server does not, for whatever reason
(e.g. incomplete device database), recognize a mobile
user-agent, then there might be a case for modifying it
towards an acceptable mobile one -- but transcoders
precisely do the reverse: they change a mobile user-agent
to an exotic full-Web one. Hence, a modification of the
original user-agent is unhelpful in all situations.
3.a Full Web only.
3.a.1 Generic content.
The Web server serves generic full-Web content, without
looking at the user-agent. In this case, modification of
the original user-agent is useless.
3.a.2 Tailored full-web with default.
The server returns full-web content customized for specific
full-web user-agents (e.g. IE 6.0, IE 7.0 and Firefox 2.0),
and serves a default representation, perhaps with a warning
("This site is best viewed with the following browsers:...")
for other user-agents. In this case, modification of the
original user-agent is either useless (in any case a default
representation will be returned), or unhelpful (the default
representation is probably better downgradable than one
specifically customized for a very specific full-Web browser).
3.a.3 Tailored with no default.
The server returns content tailored for specific full-Web
browsers, and an error for other unrecognized or unsupported
user-agents.
Here there is a case to substitute the original user-agent to
force retrieval of content. However, this works only if the
"fake" user-agent precisely corresponds to one of those
accepted by the server -- but transcoders do not tailor their
substitute user-agents with respect to the application server:
they include only general hints (like Mozilla/x.y) in the hope
this is enough to determine content generation.
Hence, the generic substitution of user-agents performed by
transcoders is not appropriate here.

Conclusion: in two cases, modification of the user-agent is useless,
in three it is detrimental, in one it is either useless or
detrimental, and in one it could be helpful, but it is currently done
inappropriately.
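
The tally in this conclusion can be checked mechanically against use cases 1.a to 3.a.3. The sketch below is only a bookkeeping aid: it assumes that "unhelpful" in the use cases corresponds to "detrimental" in the conclusion, and the label strings are chosen here for illustration.

```python
from collections import Counter

# Verdict per use case, as read from the list above.
# Assumption: "unhelpful" is counted as "detrimental".
VERDICTS = {
    "1.a": "detrimental",
    "2.a.1": "useless",
    "2.a.2": "detrimental",
    "2.a.3": "detrimental",
    "3.a.1": "useless",
    "3.a.2": "useless or detrimental",
    "3.a.3": "helpful but done inappropriately",
}

tally = Counter(VERDICTS.values())

for verdict, count in sorted(tally.items()):
    print(f"{count} x {verdict}")
```

Read this way, the seven use cases give two "useless", three "detrimental", one "useless or detrimental" and one "helpful but done inappropriately", which matches the counts stated in the conclusion.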

Let us consider the interesting symmetric use-cases: a full-Web mobile
device accessing the Internet.

1.b User-Agent switcher
Following the same reasoning as in 1.a, we find that
modification of the original user-agent is unhelpful.
2.b Mobile Web only.
2.b.1 Generic content.
Whatever user-agent, the server returns generic mobile-optimized
content. A modification of the original user-agent is useless.
2.b.2 Tailored with default.
The server returns mobile-optimized content, and a default
representation for unrecognized user-agents. Modifying it is
therefore useless -- the default representation will be returned
whether the original (full-Web) or the substitute (pseudo-full
Web) agent appears in the request.
2.b.3 Tailored, no default.
Following the same reasoning as in 2.a.3, the substitution of
original (full-Web) user-agent by a fake (full-Web) one is
useless, as it will anyway return an error.
3.b Full Web only
3.b.1 Generic content.
If the server returns generic full-Web content whatever the user
agent, then modifications of the user-agent are useless.
3.b.2 Tailored with default.
Following the reasoning in 3.a.2, modifying the original
full-Web user agent is either unhelpful (because the server
could have recognized the mobile device's agent), or useless
(the same default representation would be returned).
3.b.3 Tailored, no default.
Here the server may recognize the full-Web user-agent of the
mobile device; it is therefore unhelpful to modify it. Or it
might not support that specific user-agent, in which case it
would be sensible to substitute one that is effectively
supported by the server; however, this is not what transcoders
do: they provide a generic, not a real user-agent instead --
this is inappropriate.

So one situation where it is detrimental, four where it is useless,
one where it is either useless or detrimental, and one where it is
either useless or inappropriately done. It is also an acid test: do
transcoders modify requests from full-Web capable mobile browsers? If
so, something seriously weird is going on, as the excuse has generally
been to make full-Web content available to non-full-Web capable devices.

From this examination, one can only conclude that proponents of the
preservation of the original user-agent do not have to justify their
position and established practice. Rather, the onus is on the
proponents of the substitution of the user-agent to argue in favour of
their approach, which disrupts established practice. There is
basically only one use case where changing the mobile user agent to a
desktop user agent might help, but it remains to demonstrate:

a) The relevance of the scenario. Perhaps people at Google could let
one of their crawlers roam over a few tens of thousands of WWW sites
to gather statistics on the relative frequency of each aforementioned
scenario.
b) The benefits resulting from handling that specific scenario.
c) That (a) and (b), taken together, are so overwhelming that they
more than compensate the disruptions caused in all other use-cases.

If another use case outside the framework I have presented here pops
up, this does not reduce the need for an assessment based on (a), (b),
(c).

As a final remark, I would like to note that transcoders have been
operating in the mobile Web for a long time. It started with
adaptation of HTML for PDA (Web clipping) and HTML to HDML conversion,
continued with HTML to WML, before arriving at the current crop of
content adaptation. In the old times, developers of content adaptation
software were wary of modifying the user-agent: turning generic WWW
content into a form suitable for mobile devices is so fraught with
difficulties that one would take every chance to let a server return
mobile optimized content (based on the user-agent) if it could. It is
only fairly recently that, without much justification, transcoders
have started in a systematic way to overwrite the user-agent field.

I think I have said everything I wanted regarding the CTG. The
document requires quite some rework -- nothing exceptional, since it
is a draft. I will lean back and wait for the results of this round of
revisions. Till then, readers of the WMLprogramming and W3C BPWG lists
can rejoice in the knowledge that my long-winded posts are abating at
last.


E. Casais

Comment LC-2073
Commenter: Mark Nottingham <mnot@mnot.net> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

* Section 4.1.5 "proxies should use heuristics including comparisons
of domain name to assess whether resources form part of the same 'Web
site'." I don't think the W3C should be encouraging vendors to
implement yet more undefined heuristics for this task; there are
several approaches already in use (e.g., in cookies, HTTP, security
context, etc.); please pick one and refer to it specifically.
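
As an illustration of what a specific, well-defined test could look like, the sketch below compares two URLs by scheme, host and port, in the style of the same-origin comparison used for browser security contexts. It is one possible choice among the approaches mentioned, not the one the comment or the guidelines select; `origin` and `same_origin` are hypothetical helper names.

```python
from urllib.parse import urlsplit

# Default ports for the schemes handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return a (scheme, host, port) tuple for an absolute URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a, url_b):
    """True if both URLs share scheme, host and port."""
    return origin(url_a) == origin(url_b)
```

A rule of this kind is stricter than a domain-name heuristic: `http://example.com/` and `http://m.example.com/` count as different sites, which may or may not be what a transformation proxy wants, but the outcome is at least predictable.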

Comment LC-2005
Commenter: EdPimentl <edpimentl@gmail.com> (archived message)
Context: 4.1.5 Alteration of HTTP Header Values
assigned to Jo Rabin
Resolution status:

The styleguide should spell out very clearly "The Transcoder is NOT
allowed to change the User-Agent String".


Developed and maintained by Dominique Hazaël-Massieux (dom@w3.org).
$Id: Overview.php,v 1.46 2013-10-04 08:11:33 dom Exp $
Please send bug reports and request for enhancements to w3t-sys.org