ChangeProposals/removesrcdoc

From HTML WG Wiki
Revision as of 21:23, 31 March 2010 by Mturvey (Talk | contribs)


Remove the srcdoc attribute

Summary

Remove the srcdoc attribute.

Rationale

The original bug report for removing srcdoc provided the following change request[1]:

    This recent entry does not have universal acceptance, and the group was still
    discussing it when the editor added it to the specification. 


    The supposed use case for this attribute is weblog comments, but concerns about
    HTML security have been resolved with weblog and other application comments
    years ago. In addition, support for this attribute could give the impression
    that online sites don't need any other security, which is false. Script
    injection is only one aspect of security related to weblog comments, and
    considered a fairly trivial one at that.


    This needs to be removed from the specification.

The rationale given by the HTML5 Editor for keeping this attribute:


   Rationale: I'm happy to remove this attribute from the W3C HTML5 specification
   if that's what the working group wants. The last time I removed a feature based
   on a bug report such as this, I started a minor war, however, so I suggest that
   you raise this via the change proposal process if you really feel this way.


According to the HTML5 Editor, there is no rationale for keeping this attribute. That made this change proposal more difficult to write, because I had to base my arguments on guesses and on what I could glean from email messages.

There was a great deal of contention about this attribute before it was added. It spawned another issue (Issue 103) because of concerns about escaping the markup in the attribute, especially for XHTML. That this caused some difficulty for members of this group, who are defining the next version of HTML/XHTML, should give us pause, because knowing what must be escaped is going to be that much more difficult for the average web developer[2][3].
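
To make the escaping concern concrete, here is a minimal sketch in Python. It assumes the attribute value is wrapped in double quotes, that the HTML syntax requires escaping ampersands and double quotes, and that the XML syntax additionally requires escaping the less-than sign. The function names are hypothetical and are not part of any specification or library.

```python
def escape_srcdoc_html(markup: str) -> str:
    """Escape markup for a double-quoted srcdoc attribute in the HTML syntax."""
    # Escape ampersands first so the entities added next are not escaped twice.
    return markup.replace("&", "&amp;").replace('"', "&quot;")


def escape_srcdoc_xhtml(markup: str) -> str:
    """Escape markup for a double-quoted srcdoc attribute in XHTML (XML syntax)."""
    # Assumption: XML attribute values may not contain a raw "<".
    return escape_srcdoc_html(markup).replace("<", "&lt;")


comment = '<p>Tom & "Jerry"</p>'
html_iframe = f'<iframe sandbox srcdoc="{escape_srcdoc_html(comment)}"></iframe>'
xhtml_iframe = f'<iframe sandbox="" srcdoc="{escape_srcdoc_xhtml(comment)}"></iframe>'
```

The same comment needs two different escaped forms depending on how the page is served, which is the kind of detail an application author would have to get right every time.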

When asked about the purpose of srcdoc, the HTML5 Editor replied that the use case for the attribute is weblog comments[4]. Because the srcdoc attribute works within a sandboxed context, the use of the attribute would prevent script injection in comments. Since this change was targeted to a specific use related to weblog software, I asked Matt Mullenweg[5], the creator of WordPress, one of the more popular weblogging tools in use today, about the usefulness of this attribute. He responded[6]:

    We haven't had any HTML-level problems in comments in a while.


    We use and maintain a library called KSES that we use for all 
    sanitation, and it has served us well.

I brought Matt into the discussion for two reasons. The first is that I wanted to bring in an "implementor", and demonstrate that an implementor, in the case of weblog comments, is the group or individual responsible for the weblogging software. Too often this group is focused purely on browser developers as implementors, forgetting that browsers are not the only application group impacted by HTML5 changes.

The second reason was to demonstrate that no one from the weblogging community has asked for this, and that much, if not most, of the weblogging community is very unlikely to use this uncomfortable, awkward attribute. The weblogging community has long had to deal with security problems, and has devised sophisticated tools and techniques to protect not only against script injection, but also against SQL injection, the greater hazard for weblog comments, and even against the accidental insertion of a stray non-printing character in XHTML.

In point of fact, relying on something such as srcdoc can make a site less secure rather than more, because it only touches on one vulnerability, when we're faced daily with a host of new and ever more sophisticated threats[7].

So the use case is heavily flawed. What are the other issues associated with srcdoc? I've already mentioned the concerns about escaped characters, and how this will differ between HTML and XHTML, which in itself will discourage its use with most applications like Content Management Systems. Are there other issues?

Another issue is when something like srcdoc could actually be used, and whether the restrictions on its use are severe enough to defeat its purpose. This attribute can't be used effectively for potentially years to come, because web browsers don't render what's contained in an attribute unless specifically directed to do so[8]. Until then, the fallback is used, which is the iframe's src attribute. In the meantime, our existing applications that do provide security become more sophisticated, more capable, and more tightly integrated, until by the time we could use srcdoc effectively, few of us will even remember what it is, and fewer still will be interested.

An alternative to srcdoc was suggested in the discussion surrounding this attribute. Instead of embedding markup in the attribute, something that has been actively discouraged for some time, we can use a data URI with the src attribute, getting the same functionality in a form that is usable sooner and doesn't require us to embed markup in an attribute. However, the data URI has its own challenges, specifically the fact that the data would be rendered without the security controls in legacy browsers[9]. Again, though, a data URI in an iframe src attribute would most likely never be used for weblog comments. I find it unlikely that any approach related to the iframe and sandboxing will ever be used with weblog comments, so it might be best to find another use case with which to defend this attribute.
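
As a rough illustration of the data URI alternative, the following Python sketch percent-encodes a fragment of markup and places it in an iframe's src attribute. The helper name and the choice of a text/html media type with a UTF-8 charset are assumptions made for the example; it is a sketch, not a recommended implementation, and it does nothing about the legacy-browser problem noted above.

```python
from urllib.parse import quote


def html_to_data_uri(markup: str) -> str:
    """Percent-encode markup and wrap it in a text/html data: URI."""
    # safe="" makes quote() encode every reserved character, including "/".
    return "data:text/html;charset=utf-8," + quote(markup, safe="")


comment = "<p>Hello & welcome</p>"
data_uri = html_to_data_uri(comment)
iframe = f'<iframe sandbox src="{data_uri}"></iframe>'
```

Because the markup is percent-encoded, the attribute value contains no quotes, angle brackets, or ampersands, so the HTML versus XHTML escaping question does not arise for the attribute itself.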

One use case that does come to mind is the plug-ins we drop into our web pages. The source of the plug-in comes from an external site, which could be cause for alarm. However, plug-in security is not related to the srcdoc attribute, so I have a hard time determining what use case would apply. Perhaps there is none, in which case there's even more of a reason to remove this potentially harmful, most definitely problematic attribute.


Details

Remove all references to the srcdoc attribute from the HTML5 specification. If such a removal results in a gap in coverage, consider following one of two paths: either remove whatever other material is necessary to eliminate the gap, or work with the W3C HTML WG to come up with an alternative approach, if one can be found.

I would also strongly suggest finding another use case, if you want to pursue this type of functionality.

Impact

Positive Effects

Removes a confusing, potentially harmful, and not really usable attribute, forcing us either to re-address the issue or to consider dropping this particular subset of web page security from the HTML5 specification. Perhaps there are some aspects of the web that cannot be managed by browsers.

Negative Effects

Requires some of the Editor's time to make the change. Could potentially leave a gap in coverage, if this subset of security is still of interest, and would require more work in the HTML WG. However, counter proposals to this proposal might be able to provide effective alternatives. Or not, if none really exists. I don't believe any exists, which is why I'm not proposing an alternative.

Conformance Classes Changes

none

Risks

none

References

[1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=8818

[2] http://www.w3.org/html/wg/tracker/issues/103

[3] http://lists.w3.org/Archives/Public/public-html/2010Mar/0431.html

[4] http://lists.w3.org/Archives/Public/public-html/2010Jan/1193.html

[5] http://lists.w3.org/Archives/Public/public-html/2010Jan/1223.html

[6] http://lists.w3.org/Archives/Public/public-html/2010Jan/1337.html

[7] http://lists.w3.org/Archives/Public/public-html/2010Jan/1318.html

[8] http://lists.w3.org/Archives/Public/public-html/2010Jan/1325.html

[9] http://lists.w3.org/Archives/Public/public-html/2010Jan/1346.html