Using History, Collaboration, and Transparency to Provide Security

With some digressions on usability and security of web service authentication in general

Mary Ellen Zurko, Dave Wilson, IBM Software Group

Position Paper for Toward a More Secure Web

W3C Workshop on Transparency and Usability of Web Authentication


As the commercial, economic, and financial attractiveness of the data and interactions that occur on the Web rises, so does the activity by persons and organizations who would take advantage of it for their gain, licit and illicit. We see the same phenomenon with spam in email. The web and email infrastructures are global and open to (practically) anyone, requiring only modest means or physical access to some populous center of activity. In the past, physical limitations, closed communities, and heterogeneity of software provided bottlenecks on the kinds of attacks and scams possible. These constraints do not exist on the Web. While non-functional attributes such as performance and usability have substantially improved with recent web programming approaches such as AJAX, no practical programming techniques or patterns have been proposed to increase security on the web.
 
Pure authentication of web sites will not solve the problem of scam web sites. By analogy with spam again, there is some evidence that the early adopters of new email authentication technologies such as Sender Policy Framework (SPF) are spammers. SSL provides web site server authentication and is supported by all widely deployed web clients. Being able to authenticate in classic web and security parlance means almost nothing to the vast majority of users when the identity that is authenticated is meaningless to their task at hand. Domain names are not useful identities. Even users who can parse a web site’s domain name and compare it to what they expect can experience uncertainty or “false positive” security concerns in a world where even legitimate businesses use multiple domain names, and outsourcing will continue to occur. The current emphasis on web services will only increase outsourcing of business activities to web sites in another domain. Authentication of a computer identity can be useful in more constrained worlds where the association between computer identity and warm body is known (such as in an enterprise). This association allows retribution and recovery after the fact, targeted at the identifiable scofflaw. Obviously this is much more problematic on the Web.
 
How do users know if a site is trustworthy? Today, they use a wide range of indicators, as borne out in studies by researchers such as Andrew Patrick. They look at the professionalism of the design, and many other attributes that have nothing to do with security mechanisms. The same happens with suspected spam. Even savvy computer professionals will scan the body of a spam email, checking for signs in the content, if something else (DNS domains, for example) does not seem right. On top of that, different web sites are likely to be trusted for different information. Company confidential information is fine for any web site my company runs and owns. Phishing attacks currently target information that can be used to impersonate the user in a wide array of circumstances (name and social security number or credit card) or more limited circumstances (name and password to a financial web site). Targeted changes can protect limited circumstances (specific banks requiring or supporting two factor authentication), but do not solve the broader problems.
 
It’s unlikely any of these problems will be solved without deployment of additional client-side software, in the form of updates to browsers, add-ons to browsers, or other changes.
 
While, sadly, nothing in this world is absolute, there are four types of information that can indicate with some real-world assurance how trustworthy a site is:
 
1.    The user’s personal history with the site;
2.    What others say about or have done with the site;
3.    What a knowledgeable and responsible party says about the site;
4.    The time line for any of the information about the trustworthiness of the site.
 
The most obvious and easiest to integrate information is the user’s personal history with the site. Have they been to the DNS domain before? Did they generally use a secured protocol for the current URL, or for the URL they are requesting or posting to? Did they send identical data there before? Did they send anything there before? Obviously browsers can do a better job of checking and representing this information. Some are beginning to. A more detailed and stronger model of what is trustworthy, or conversely what should be flagged as untrustworthy or even unknown, is useful and attainable. Following the same link in a phishing email twice, or two different links in that phishing email, should not make a site appear to be a peer of sites the user has gone to and interacted with successfully in the past.
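As a minimal sketch of this idea, the following illustrates how a browser (or add-on) might classify a requested URL against the user’s own history. All names and thresholds here are illustrative assumptions, not an existing API; note that merely revisiting an unknown site does not promote it to “familiar”:

```python
from urllib.parse import urlparse

class HistoryTrustModel:
    """Classify sites by the user's own browsing history (illustrative sketch)."""

    def __init__(self):
        # domain -> {"visits": int, "https_visits": int, "submitted_data": bool}
        self.history = {}

    def record_visit(self, url, submitted_data=False):
        parts = urlparse(url)
        rec = self.history.setdefault(
            parts.hostname,
            {"visits": 0, "https_visits": 0, "submitted_data": False})
        rec["visits"] += 1
        if parts.scheme == "https":
            rec["https_visits"] += 1
        rec["submitted_data"] = rec["submitted_data"] or submitted_data

    def assess(self, url):
        parts = urlparse(url)
        rec = self.history.get(parts.hostname)
        if rec is None:
            return "unknown"       # never visited: no basis for trust
        if rec["https_visits"] and parts.scheme != "https":
            return "downgrade"     # previously secured, now unsecured: flag it
        if rec["visits"] < 3 and not rec["submitted_data"]:
            return "unfamiliar"    # seen before, but no established relationship
        return "familiar"
```

A site reached twice through a phishing email would still assess as “unfamiliar”, rather than appearing to be a peer of long-standing, successfully used sites.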
 
Once we have an established model for how a user’s history can help, the history of others can help establish the trustworthiness of a site as well. The user’s relationship to those others, and the reliability of that information, becomes an issue. In a constrained environment, such as an enterprise, there’s a ready-made population of reliable colleagues whose history can be tapped. These sources are reliable because of the real-world relationship between them and the user (employed by the same business, potentially direct and known colleagues if scoped to the org chart). This is reflected in the authentication and naming mechanisms in place in the organization. In the consumer and individual’s case, ensuring reliability is more complex. Approaches with potential include “friends and family” (what history do people in your address book have with the site?) and “wisdom of the crowds” (what history does the general populace have, and how does that compare with their history with supposedly similar institutions?). Willingness to share, safety of and controls on who can see what’s shared (people in your own address book, for example), and reliability of the information from a computer systems point of view (is it really from the people you think it’s from?) are all issues to be addressed.
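The “friends and family” approach can be sketched as a simple aggregation over what the user’s contacts report. This is an illustrative assumption of how such sharing might be queried; a real system would need authenticated, privacy-preserving sharing of each contact’s history, and the `min_reports` threshold is arbitrary:

```python
def crowd_assessment(domain, contact_histories, min_reports=3):
    """Aggregate contacts' reported history with a domain (illustrative sketch).

    contact_histories maps a contact's name to the set of domains that
    contact has successfully interacted with.
    """
    reports = sum(1 for domains in contact_histories.values()
                  if domain in domains)
    if reports >= min_reports:
        return "vouched"       # several known contacts have a history with the site
    if reports > 0:
        return "some-history"  # weak signal; not enough contacts to vouch
    return "no-history"        # no one you know has been there
```

A site that none of the user’s contacts have ever visited would at least be presented differently from one that several colleagues bank with daily.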
 
Already in the definition of others we see that some “others” are more fit for the purpose, more trustworthy with respect to an individual, than others. Specifically knowledgeable and responsible parties, authorities, can provide accountability and specific and targeted guidance. While those are innate in enterprises, the history of authorities for recommendations on the web is not good. Neither PICS nor P3P authorities seem to have emerged. SSL authentication authorities for web sites have, but that seems to be part of the problem. Any web site can get an SSL certificate. It’s not clear users check for SSL or balk at a self-signed certificate. And authentication of the DNS name has not proven to be very useful, since there is no hard link between institutions and their DNS name(s). In general, users don’t configure their SSL authorities (they come with the browser or are augmented when they say “OK” to annoying dialogs) or their P3P preferences (they too come with the browser).
 
Once a relationship is established, an institution could in theory act as an authority on its DNS names itself. This pushes a multitude of security and usability issues into one place, and also brings up the issue of spoofing at the UI level. The ability for scripting to overlay trustworthiness indicators is one of the many current problems. The traditional solution is to allocate secure processing areas that cannot be overwritten by untrusted processing (scripting). An alternative might be to use the natural processes of user customization and personalization. No one’s computer desktop looks the same, if they run in an environment that is highly personalizable and responsive to application customizations. A security indicator that was naturally easy to move around to an appropriate place and that provided feedback based on the user’s history would be difficult for scripts to overwrite accurately (assuming a scripting language with a security model that made sure the scripts could not query the placement and contents). Supporting a scripting language with no security model makes the problem intractable, or forces it into the lower supporting layers.
 
To some extent, the time line of the history information is also potentially useful, in particular for damping down “flash crowds” from phishing scams. The longer a scam needs to go on, the more difficult it is to pull off. This couples well with the transparency of sharing information about interacting with sites. The more visible the activity is, the more difficult it is to continue it if it’s a scam. The transparency provided by collaboration and sharing can counterbalance the lack of authority and constraints that make these problems seem much harder to solve on the web. Privacy concerns are the obvious reason that it will be hard to make information available to others. Privacy-preserving techniques (anonymizing) can help, but recent US government calls for records from search companies will make trust in any central point more difficult to come by. Authentication authorities such as OpenID may provide a third party who can report activity data.
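The flash-crowd signal can be made concrete: if nearly all first visits to a site fall within one short burst, that pattern is characteristic of a phishing campaign rather than an established site. The following is a hypothetical sketch of such a detector; the 24-hour window and the population-wide feed of first-visit timestamps are both assumptions:

```python
def flash_crowd_score(first_visit_times, window_hours=24):
    """Fraction of all first visits to a site that fall in the busiest
    window (illustrative sketch). Timestamps are in seconds. A score near
    1.0 means almost everyone saw the site for the first time in one burst.
    """
    if not first_visit_times:
        return 0.0
    times = sorted(first_visit_times)
    window = window_hours * 3600
    best = 0
    j = 0  # left edge of the sliding window
    for i, t in enumerate(times):
        while times[j] < t - window:
            j += 1
        best = max(best, i - j + 1)
    return best / len(times)
```

A legitimate institution accumulates first visits gradually over months; a scam site that must collect credentials before takedown shows a score near 1.0, and the time line alone becomes grounds for extra caution.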
 
While the recommendations detailed above require additional client-side software, they also require a corresponding change in client-side behavior to pay attention to or take advantage of these new capabilities. Another, more active, change in client behavior is to incorporate personalized experiences into the web server authentication mechanisms. Passmark (www.passmarksecurity.com) provides an authentication platform where the user personalizes their authentication experience with text phrases, challenge questions, and images. When the user authenticates to a Passmark-enabled system, they first enter just their user name, are brought to an area that they’ve personalized (via text phrases and image selection), and then enter their password from this interface that they’ve had a hand in creating. Phishers cast a wide net, hoping to obtain data by presenting a standard interface to all users; adding personalization to the interface disrupts this model.
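The two-stage flow can be sketched as follows. This is not Passmark’s implementation; it is an illustrative model of the pattern, with plaintext passwords kept only for brevity (a real system would hash them, and would return decoy personalization for unknown user names to resist probing):

```python
class PersonalizedLogin:
    """Two-stage, personalized login in the Passmark style (illustrative sketch)."""

    def __init__(self):
        # username -> {"image": ..., "phrase": ..., "password": ...}
        self.users = {}

    def enroll(self, username, image_id, phrase, password):
        self.users[username] = {"image": image_id,
                                "phrase": phrase,
                                "password": password}

    def stage_one(self, username):
        """User supplies only a user name; the server returns the
        personalization chosen at enrollment. The user should refuse to
        type a password unless the image and phrase match expectations."""
        user = self.users.get(username)
        if user is None:
            return None  # a real system would return a decoy instead
        return {"image": user["image"], "phrase": user["phrase"]}

    def stage_two(self, username, password):
        """Only after recognizing the personalization does the user send a password."""
        user = self.users.get(username)
        return user is not None and user["password"] == password
```

A phishing site that has never seen the user’s enrollment cannot reproduce the chosen image and phrase, so its standard interface stands out at stage one, before any password is disclosed.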

A lot of “bread and butter” work needs to go with whatever direction is adopted. Claims of usability need to be tested, using the recognized practices in the CHI community for doing user testing to validate usability claims (lab tests, field tests). See the “Usability Design and Evaluation for Privacy and Security Solutions” chapter in Security and Usability: Designing Secure Systems That People Can Use. (In general this book is an excellent resource on the current state of the art and industry on the topic of security and usability.) Claims of secure usability need to be tested as well. That field is less mature than either pure usability or pure security testing alone. Security testing can also occur in the lab (red teaming) or in the field (ethical hacking). The challenge for usable security testing in the lab is to not create a situation where the user has heightened sensitivity to potential attacks, particularly since those come with far less regularity in the wild. The challenge for usable security testing in the field is setting up a situation that is safe and ethical for the user, again with the need for them to not know about attacks a priori.
 
The call for the workshop also mentions web client safety. The biggest challenge to safe Web client behavior is active content (such as Java, JavaScript, and other scripting languages). Web clients should never support any code that comes from the web without a security model, preferably one that involves digital signatures. There’s copious research and deployment experience on security models for active content, including our work in the context of Notes/Domino. It remains the hardest area of security to also make usable, since it’s innately about code, which is something most users should not need to understand.
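The signature-gated model can be illustrated with a small sketch. This is an assumption of shape, not the Notes/Domino design: it uses the standard library’s `hmac` as a stand-in for real public-key code signing (a deployed model would verify X.509 code-signing certificates against a configured trust store):

```python
import hashlib
import hmac

# Illustrative trust store: signer name -> verification key.
# A real client would hold certificates, not shared secrets.
TRUSTED_KEYS = {"vendor-a": b"shared-secret-key"}

def verify_and_run(script_source, signer, signature, executor):
    """Refuse to execute active content unless its signature verifies
    against a trusted signer (illustrative sketch)."""
    key = TRUSTED_KEYS.get(signer)
    if key is None:
        raise PermissionError("unknown signer: " + signer)
    expected = hmac.new(key, script_source.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        raise PermissionError("signature mismatch; refusing to execute")
    return executor(script_source)  # only signed, verified code reaches the engine
```

The point of the pattern is that unsigned or tampered content never reaches the script engine at all; the usability challenge discussed above is in how the trust store gets populated and what the user is asked when verification fails.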
 
While usable security as a field has been around for three decades, there are still many challenges. Substantial work has gone into making user authentication more usable, though most is still in the research phase. Both expert reviews and testing techniques from both disciplines are being successfully applied to specific technologies. Principles are beginning to emerge. In the context of web site authentication, we think integrated and transparent security through history and collaboration holds the best promise.