Email security and client development

Paper for the W3C HTML Mail Workshop, 24 May 2007, Paris, France

Author

Chris Newman

This is my personal opinion as a long-time participant in the email industry.

Paper

Imagine you had a web browser that every day would randomly jump to a web page created by a phishing house (a different one each day) that was trying to break into your computer. This browser would proceed to execute all JavaScript, Flash, and other active content on that web page. If you were lucky you would only get a few pieces of new spyware and adware each day, making your computer unusable within a week regardless of the quality of your anti-virus or other protection software. Then imagine this web browser had unrestricted access to your personal address book and all your business contacts, and could trivially use that information to generate emails to those contacts on your behalf. How would that affect your business relationships? Would you run this web browser if you had the option to run one that didn’t do this?

Next, imagine that the second you view one of these phishing web pages, the page owner immediately knows that you actively use your web browser (and when you use it) and can arrange to have your web browser visit their page every day, in addition to the other things it does.

That’s what rich HTML email would be like, and it is why email vendors work hard to restrict what forms of HTML are permissible (and one of many reasons why technology-aware users prefer plain text email). Because email can be sent to an unsolicited recipient, rendering rich email is a much greater security and privacy risk for the recipient than rendering a typical web page. While there was some early research on active content that could be used safely in email (Safe-Tcl), it was never clear that the benefits would outweigh the risks.

Attempts to “filter” HTML, CSS and JavaScript to remove known bad things are not viable long term. There’s always another extension in some particular HTML engine that allows risky content to be embedded in some new part of the language. What’s needed is an extensible whitelist of HTML and CSS features that are typically safe for use in email. For an example of such technology, see HTML Sanitization.
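To make the difference between a whitelist and a filter concrete, here is a minimal sketch in Python using only the standard library. The tag, attribute and URL-scheme lists are hypothetical choices made for this illustration; they are not a vetted or standardized whitelist, and the sketch does not attempt to handle CSS. Anything not explicitly listed is dropped, so an unknown future extension fails closed rather than open.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration only; not a vetted or standard list.
ALLOWED_TAGS = {"p", "br", "b", "i", "em", "strong", "ul", "ol", "li", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
SAFE_URL_SCHEMES = ("http://", "https://", "mailto:")
VOID_TAGS = {"br"}                  # elements that take no end tag
DROP_CONTENT = {"script", "style"}  # drop the enclosed text as well as the tags


class WhitelistSanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; discard everything else."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # > 0 while inside a DROP_CONTENT element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            # Links must start with an explicitly allowed scheme.
            if name == "href" and not value.strip().lower().startswith(SAFE_URL_SCHEMES):
                continue
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS or tag in VOID_TAGS:
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)
```

Given a message body containing an onclick handler, a script element, a javascript: link and a remote tracking image, sanitize() keeps only the paragraph, the text and a bare link element. A production whitelist would also need to cover CSS, character-set handling and malformed nesting, which this sketch leaves out.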

If the W3C were to take the lead on standardizing and maintaining a default HTML/CSS whitelist for risky content environments, it would likely (eventually) raise the baseline level of HTML/CSS support in email. Furthermore, it would be interesting to investigate why privacy-preserving technology such as MIME HTML (RFC 2557), which embeds the complete HTML document and all ancillary data in the email so the client can operate in a “safe no network” privacy mode, has failed to deploy well.
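As an illustration of the self-contained approach, the following sketch uses Python's standard email package to build a message whose HTML part refers to its image through a cid: URL pointing at a sibling part of a multipart/related container, so a client can render it without touching the network. The function name, addresses and the {cid} placeholder are assumptions made for this example, and the sketch shows only the Content-ID style of reference, not the Content-Location mechanism that RFC 2557 also defines.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_self_contained_message(html_template, image_bytes):
    """Build a message whose HTML refers only to data carried inside it.

    html_template is expected to contain the literal placeholder "{cid}"
    (a convention invented for this sketch) where the image reference goes.
    """
    msg = EmailMessage()
    msg["Subject"] = "Self-contained HTML example"
    msg["From"] = "sender@example.com"     # placeholder address
    msg["To"] = "recipient@example.com"    # placeholder address
    msg.set_content("Plain text fallback for clients that do not render HTML.")

    # make_msgid returns "<id@domain>"; the cid: URL uses it without brackets.
    cid = make_msgid(domain="example.com")
    html = html_template.replace("{cid}", cid[1:-1])
    msg.add_alternative(html, subtype="html")

    # Attach the image to the HTML part, turning it into multipart/related.
    html_part = msg.get_payload()[1]
    html_part.add_related(image_bytes, maintype="image", subtype="png", cid=cid)
    return msg
```

Calling build_self_contained_message('<p>Logo: <img src="cid:{cid}"></p>', png_bytes) yields a multipart/alternative message containing a plain text part and a multipart/related part that holds the HTML and the image. A client in a “safe no network” mode could then refuse every reference that is not a cid: URL resolving inside the message.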

The current email client market is in a dire state from the viewpoint of innovation. The vast majority of mail clients generate no revenue for their vendors, and as a result there is extremely limited investment in mail client technology. Two of the leading innovative cross-platform fat client projects (Mulberry and Eudora) were recently terminated (the former due to Chapter 7 bankruptcy, the latter because it is merging with Thunderbird). Is there a way that the companies that want more features in email clients could invest money to restart innovation in the mail client industry, and possibly make it profitable for independent mail client vendors to innovate? Is other standards work needed to improve the situation? Can innovation be driven by clients on portable devices and then move back to the more traditional client environments? What sort of rich client features would benefit mail users so much that they would start a mass migration?