Poorvi Vora < email@example.com>,
Dave Reynolds < firstname.lastname@example.org>,
Ian Dickinson < Ian_J_Dickinson@hplb.hpl.hp.com>,
John Erickson < email@example.com>,
Dave Banks < firstname.lastname@example.org>
Publishing Systems and Solutions Lab.,
Electronic commerce and the Internet are changing the way information about customers is gathered and used. Unfortunately, most of the changes have reduced consumer privacy. The ease of processing, obtaining and transmitting information has made it easier both to trade in data and to collate information from different sources, and information about individuals is often collected and sold without their knowledge or consent. The ease of breaking into data stores and wiretapping has reduced the security of stored and transmitted information. Transfer of data from one location to another with different laws complicates the privacy problem further. Consumers are increasingly aware of privacy violations and, with that awareness, increasingly resistant to the dilution of their privacy. This is making the legal position of the data collector extremely fragile.
The potential of e-commerce in digital assets makes the privacy problem even more acute. Electronic tracking and user authentication make the gathering of extremely granular, personally-identifiable digital asset usage information a simple task, and increase the legal liability of the data collector. In particular, those who benefit from the collection of this information, as well as those who depend on the collection of this information to prevent contract circumvention and thus determine fraudulent use of digital assets, are vulnerable. It is not necessary to compromise consumer privacy to prevent fraud, and, to be successful, DRM systems and frameworks should not assume it is.
The class action suit against Real Networks, brought because the Real Media Jukebox assigns a unique identifier to each installation and so creates tracking potential, and the tremendous negative publicity Intel received for assigning unique numerical identities to individual Pentium processors, are simply the first examples of the impact of privacy concerns. A number of the attempts to break the security of rights enforcement systems were initiated because of growing public awareness of being `watched' by these systems. To reduce their legal liability, and to succeed among consumers whose privacy awareness is growing dramatically, DRM systems need to protect the rights of consumers along with those of content providers. The very technology used to protect content provider rights can, and should, be used symmetrically to protect consumer privacy.
Privacy concerns are extremely important for W3C. P3P is beginning to gain positive publicity, and Microsoft has committed to implementing tools for setting P3P preferences in the next versions of its browser. A DRM standards effort from W3C that does not address privacy will have a two-fold negative impact: it will handicap the DRM standard itself, and it will dilute the credibility of P3P. On the other hand, including privacy in a DRM effort will strengthen the case for both P3P and the DRM standard.
Current rights management systems focus on the rights of the content provider, who is, from this point of view, the only first-class participant in the systems. Privacy protection schemes exist that would protect consumer rights while still protecting content provider rights. We propose that the W3C provide a rights management framework that is inclusive of these technologies and thus includes the consumer as a first-class participant. Details of what this means follow. Section 2 of this paper addresses specific privacy infringement possibilities in DRM systems and ways of addressing these. Section 3 briefly mentions existing privacy technologies that address some of the privacy issues mentioned in section 2, and section 4 describes a couple of example outcomes of a W3C DRM standard that would address privacy.
A system that treats the consumer as a first-class participant is defined as one in which:
The rest of this section elaborates on specific implications of the above more abstract description of what is meant by first-class participation.
There are two essential steps in current rights management systems that violate the privacy of the consumer, or, in B2B situations, the commercial buyer. The first is the consumer/buyer authentication step. This step establishes who the buyer is, and also establishes a unique identifier for the buyer. The unique identifier can thereafter be used to collate information about the buyer obtained from the current transaction with all kinds of other information divulged by the buyer using the same identifier. The very requirement of this step prevents the possibility of anonymous browsing. The second step that violates privacy is the tracking step. The amount and quality of tracking information that can be generated for digital media differs by many orders of magnitude from that generated for physical media, and it can be very granular and accurate. A usage log for a single user can itself be a fairly valuable digital asset, often more valuable than the asset whose use it logs.
The justification provided for user authentication and tracking is that they form the fraud prevention mechanism of current rights management systems. If a user identifies herself and agrees to a contract, she can later be sued if tracking indicates she has violated the contract. While this is true, it is not the only way in which fraud can be prevented, and fraud need not be highly prevalent in systems with more privacy. The literature on electronic cash is rife with ways of preventing fraud while retaining degrees of anonymity - for a great overview and critical review that separates the anonymous from the not-so-anonymous, see .
There is no doubt that both user identification and the generation of user profiles can provide tremendous value, other than fraud prevention, to both the consumer and the content provider. For example, the detailed information can be used in pay-per-view business models; it can be fed back into pricing models; it can be used for highly directed marketing; and can also be used for efficient classification and associated search and retrieval, providing dramatic benefits to both the consumers and the sellers of media assets and associated services. Tracking of digital media is also useful in a closed digital media publishing system (like a commercial printing workflow) where the players may be assumed to be trusted and payments are made based on the amount of usage of individual assets. In highly trusted, closed systems, this might be the only expression of rights management.
The value of tracking and user identification is considerably diluted, however, when the consumer is not allowed to participate in the determination of the degree of tracking, and when he is not allowed to control the degree of anonymity allowed in the system. While we do not propose allowing only the consumer to determine these, they should not be established solely by the needs and assumptions of the content provider as they are today.
The focus of DRM systems needs to change to include the consumer as a first-class participant. This implies the following:
As in rights management, privacy technology can be thought of as (policy/contract) expression technology and (policy/contract) compliance technology. While W3C may not be an appropriate body for the details of compliance technology, it has an impressive history in expression technology for other applications (P3P, RDF, XML, HTML).
There are some aspects of compliance technology that cannot be ignored, however, because they are interwoven very finely with rights management protocols. A good example is anonymity. A rights management system can be built on the assumption that each user will demonstrate their public key, which is usually closely linked with personal identity, or it can be built on the assumption that a user will demonstrate the minimum information required to prevent fraud. The latter allows the user to then add on any other information as a bonus in return for a discount, perhaps, from the seller. It also enables the use of the protocol by those users who wish to retain more anonymity.
Anonymity may be thought of as protection of the unique identifier associated with a user. Different degrees of anonymity are required for different applications, and by different users. It is important to allow varying degrees of anonymity. Example existing schemes are:
Screening: this kind of anonymity implies the use of a trusted screening party as a mediator. The trusted party strips information passing through it of any identifiers that can be used by outsiders. It is not very strong anonymity because all the information is available to the third party. The third party may encrypt the information with the user's public key so that only the user may access it thereafter, thus preventing even the third party itself from accessing the data. Even so, the third party knows that information was generated, when it was generated, and between which two parties. This kind of anonymity is broken if the third party reneges on the understanding that the information held is private, and shares or sells the unencrypted information.
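The screening arrangement described above can be pictured with a minimal sketch. The field names, the `ScreeningParty` class and the message layout are all assumptions made for illustration; the paper does not define a message format. The sketch shows both halves of the argument: the recipient no longer sees identifiers, but the mediator still records that a message passed, when, and between whom.

```python
from datetime import datetime, timezone

# Hypothetical set of identifying fields; a real deployment would define its own.
IDENTIFYING_FIELDS = {"user_id", "email", "ip_address", "public_key"}


class ScreeningParty:
    """Toy trusted mediator that strips identifiers before forwarding."""

    def __init__(self):
        # What the mediator itself still learns: time, sender, recipient.
        self.log = []

    def forward(self, sender: str, recipient: str, message: dict) -> dict:
        self.log.append((datetime.now(timezone.utc), sender, recipient))
        # Drop every field that outsiders could use as an identifier.
        return {k: v for k, v in message.items() if k not in IDENTIFYING_FIELDS}


mediator = ScreeningParty()
request = {"user_id": "alice", "email": "alice@example.org",
           "asset": "ebook-17", "action": "view"}
forwarded = mediator.forward("alice", "bookshop", request)
print(forwarded)          # only the non-identifying fields remain
print(len(mediator.log))  # the mediator has one record of the exchange
```

The `log` attribute is the weak point the text identifies: anonymity holds only as long as the mediator keeps that record private.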
Nyms: this kind of anonymity is slightly stronger than screening and can be used in association with it. It prevents privacy violation by not allowing data from different sources or sessions to be combined into a composite personality. A user maintains a number of keys instead of a single key and uses different keys for different transactions, merchants or sessions. There is no one unique identifier associated with the user. Hence, for example, the user's profile with amazon.com cannot be merged with his profile at hp.com, which prevents merchants from building complete identities by collaborating. At the same time, this scheme allows the user to maintain a profile with an individual merchant. The profile itself can be very beneficial to the user because it enables targeted marketing that matches the user's tastes.
NymIP is a proposal for a standard Internet Protocol using nyms.
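As a rough illustration of per-merchant nyms, the sketch below derives a separate pseudonymous identifier for each merchant from one secret held only by the user. This derivation (HMAC-SHA256 over the merchant name) is a technique swapped in for brevity; it is not the NymIP protocol, and the paper itself speaks of maintaining several keys rather than deriving them. The function name `derive_nym` and the 16-character truncation are assumptions.

```python
import hashlib
import hmac
import secrets


def derive_nym(master_secret: bytes, merchant: str) -> str:
    """Derive a stable, merchant-specific pseudonym from a user-held secret.

    Assumption: identifiers are derived with HMAC-SHA256; without the
    master secret, two nyms cannot be linked to the same user.
    """
    digest = hmac.new(master_secret, merchant.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


master = secrets.token_bytes(32)       # never leaves the user's device
nym_amazon = derive_nym(master, "amazon.com")
nym_hp = derive_nym(master, "hp.com")

# Same merchant gives the same nym, so a per-merchant profile can persist.
print(nym_amazon == derive_nym(master, "amazon.com"))
# Different merchants see unrelated identifiers, so profiles cannot be merged.
print(nym_amazon != nym_hp)
```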
Stefan Brands of Zero Knowledge Systems has come up with a number of schemes that provide remarkably strong anonymity while helping prevent fraud. These schemes build on earlier exceptional work by David Chaum. A very good review of these and other schemes may be found in . The essential idea of these schemes is to enable the use of tokens that contain the information required to carry out a transaction. These tokens may be electronic cash tokens, symbolizing a certain amount of money, or vouchers such as those required for rights management transactions. The schemes provide protocols to prove possession and honest use of the token without requiring the disclosure of additional information, and enable simultaneous anonymity and fraud prevention to a degree not previously possible. The current public key based rights management protocols are special cases of these protocols, but make assumptions not made by these protocols and hence cannot incorporate them. We will refer to these protocols as Proof of Knowledge (POK) protocols in this paper. We use the term for both the most general case that subsumes all others including protocols based on PKI and SPKI (Simple Public Key Infrastructure), as well as for the specific strong anonymity technologies built around the protocols of . It will be clear from the context which of these we mean.
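To give a feel for what proving possession without disclosure means, here is a minimal sketch of a Schnorr-style proof of knowledge of a discrete logarithm, made non-interactive with a hash-derived challenge. This is a textbook building block chosen for illustration, not one of the Brands or Chaum schemes discussed above. The modulus (the Mersenne prime 2^127 - 1), the base 3, and all function names are assumptions; the parameters are toy values and not suitable for real use.

```python
import hashlib
import secrets

P = 2**127 - 1   # a Mersenne prime, used here only as a toy modulus
G = 3            # toy base
ORDER = P - 1    # exponents are reduced mod P - 1 (Fermat's little theorem)


def _challenge(public: int, commitment: int) -> int:
    """Derive the challenge from the public values (Fiat-Shamir style)."""
    data = f"{G}|{public}|{commitment}".encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % ORDER


def keygen():
    """Return (secret, public) where public = G^secret mod P."""
    secret = secrets.randbelow(ORDER - 1) + 1
    return secret, pow(G, secret, P)


def prove(secret: int):
    """Prove knowledge of the secret without revealing it."""
    public = pow(G, secret, P)
    r = secrets.randbelow(ORDER - 1) + 1          # one-time random nonce
    commitment = pow(G, r, P)
    c = _challenge(public, commitment)
    response = (r + c * secret) % ORDER
    return commitment, response


def verify(public: int, commitment: int, response: int) -> bool:
    """Check G^response == commitment * public^c (mod P)."""
    c = _challenge(public, commitment)
    return pow(G, response, P) == (commitment * pow(public, c, P)) % P


secret, public = keygen()
commitment, response = prove(secret)
print(verify(public, commitment, response))   # True for an honest prover
```

The verifier learns that the prover holds the secret behind `public`, but the transcript contains only `commitment` and `response`, neither of which reveals the secret by itself.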
The anonymity technologies begin to enable different ways of looking at identity, including, specifically, the strong connections between personal profile revelation and proving identity. Parts of the personal profile are revealed to identify one's self. Because the personal profile is an asset, these parts are revealed carefully and the degree of revelation is explicit.
In addition to the anonymity technologies of various degrees, usage information can be made available at different levels of granularity by the asset viewer. A standard vocabulary for the degree of granularity is needed. In its absence, one can think of the granularity as being descriptive metadata about a personal profile asset.
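The idea of revealing usage information at different levels of granularity can be sketched as follows. The three level names (`event`, `per_asset`, `total`) and the log layout are purely hypothetical, since, as noted above, no standard vocabulary for granularity exists.

```python
from collections import Counter

# Hypothetical usage log kept on the viewer's side: (asset, timestamp) pairs.
USAGE_LOG = [
    ("ebook-17", "2001-01-10T09:15"),
    ("song-3", "2001-01-10T09:40"),
    ("ebook-17", "2001-01-11T20:05"),
]


def reveal(log, granularity: str):
    """Return a view of the log at the granularity the viewer agrees to."""
    if granularity == "event":
        return list(log)                                   # every access, with time
    if granularity == "per_asset":
        return dict(Counter(asset for asset, _ in log))    # counts per asset only
    if granularity == "total":
        return len(log)                                    # a single aggregate number
    raise ValueError(f"unknown granularity: {granularity}")


print(reveal(USAGE_LOG, "per_asset"))   # {'ebook-17': 2, 'song-3': 1}
print(reveal(USAGE_LOG, "total"))       # 3
```

The chosen level could then travel with the data as descriptive metadata about the personal profile asset, in the sense described above.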
Finally, tracking information also needs to be expressed, and no standard vocabulary exists for this either.
To illustrate the use of the ideas discussed earlier in the paper, we present a couple of example outcomes.
User Authentication with
Degrees and types of anonymity, for example:
Anonymized through trusted third party
Choice of when to reveal
Usage Tracking with
Extent of tracking (what is being tracked?)
Controlled revelation of usage data
Rights clearing with
degree of usage and rights information staying with client vs. rights clearing agency (how much of the tracking information is sent back to the clearing agency and at what level of aggregation)
how often the rights clearing agency is contacted with respect to asset access
granularity of divulged usage logs
The main HP position paper by John Erickson et al. proposes a Policy and Rights Expression Platform (PREP) that provides `a model defining open interfaces between three architectural levels of abstraction: rights expression languages, rights messaging protocols and mechanisms for policy enforcement and compliance' [section 4, 9]. In this section we propose aspects of these levels of abstraction that would be useful from the point of view of privacy.
A personal profile would be an asset in the system, with ownership, access rights and descriptive as well as rights metadata associated with it.
Rights Expression Languages: A semantic layer that all rights expression languages can be translated into should address the needs of privacy vocabularies and syntaxes including vocabularies for profile description (vocabulary not for the profile itself but for metadata about the profile such as the level of granularity), access rights to profiles (example P3P, XrML), degrees of anonymity, and degrees of tracking. As far as possible, this layer should not divide profiles and media assets into two groups, and should instead enable possible combinations of these into composite documents.
Rights Messaging Protocol: The rights messaging protocol should not require user identification with a key as in traditional PKI. It should, instead, allow a choice of identification from the choices of 4.1 and, in general, allow for POK with or without a third party mediator (which generalizes all the options of 4.1). This is consistent with the notion mentioned in section 3.1 that detailed identity is synonymous with personal profile, which is an asset and hence is revealed carefully and explicitly, not frivolously.
Mechanisms for Policy Enforcement: The bindings mentioned in  between elements of the language layer and compliance mechanisms should not depend on traditional identity, but on POKs. Further, these bindings should enable privacy compliance for personal profiles. Authorization tokens for rights enforcement at the server end should, again, use POK instead of traditional identity. Secure containers should not assume only media assets (where there is a minimum granularity for access - in an electronic book, for example, a paragraph could be the smallest allowed unit for granular specification of rights) but also personal profiles (where granularity is a very different issue).
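The suggestion in this section that a personal profile be treated as an asset alongside media assets can be sketched with one shared record type. Every name here (`Asset`, `is_permitted`, the metadata keys, the nym strings) is hypothetical and does not come from PREP, P3P or XrML; the point is only that the same structure and the same access check serve both kinds of asset, and that owners and requesters are named by nyms or tokens rather than by traditional identity.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    owner: str                     # a nym or token, not a real-world identity
    kind: str                      # "media" or "profile"
    descriptive: dict = field(default_factory=dict)   # e.g. granularity level
    rights: dict = field(default_factory=dict)        # requester -> allowed actions


def is_permitted(asset: Asset, requester: str, action: str) -> bool:
    """The same check applies to media assets and personal profiles."""
    return action in asset.rights.get(requester, set())


ebook = Asset("ebook-17", owner="publisher-nym-1", kind="media",
              descriptive={"min_unit": "paragraph"},
              rights={"reader-nym-7": {"view"}})

profile = Asset("profile-7", owner="reader-nym-7", kind="profile",
                descriptive={"granularity": "per_asset"},
                rights={"publisher-nym-1": {"read_summary"}})

print(is_permitted(ebook, "reader-nym-7", "view"))            # True
print(is_permitted(profile, "publisher-nym-1", "read_raw"))   # False
```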
We have described the privacy invasions possible in rights management systems and explained why a W3C DRM proposal ought to avoid these. We propose that, instead, a W3C DRM standard treat the consumer as a first-class participant along with the content provider, in a symmetric system which treats consumer identity and personal profiles as assets. This would protect content providers from the legal liabilities of privacy invasion, promote success among consumers whose privacy awareness increases by the day, and enhance the credibility of P3P. We have surveyed a number of privacy protecting technologies and point out that it is not necessary to disregard privacy for fraud prevention. While there are important benefits to both consumer and content provider of profile and identity revelation, these revelations should not be assumed, and there should be an explicit mechanism for these revelations. Further, the consumer should have control over what is acceptable. We provide example outcomes of the workshop that would be in keeping with this vision, including specific suggestions with respect to the PREP framework proposed in .
"RealNetworks in Real Trouble", Wired News Report, 9:15 a.m. Nov. 10, 1999 PST
Declan McCullagh, "Intel Nixes Chip-Tracking ID", Wired News Report, 3:00 a.m. Apr. 27, 2000 PDT
"SafeNet 2000: Security and Privacy Leaders Gather at Microsoft Campus to Seek Solutions to Challenges Facing Internet Users", Redmond, Wash., Dec. 7, 2000
Julie E. Cohen, "A Right to Read Anonymously: A Closer Look at 'Copyright Management' in Cyberspace", 28 Conn. L. Rev. 981 (1996)
Stefan Brands, Rethinking Public Key Infrastructures and Digital Certificates; Building in Privacy, MIT Press, August 2000
David Chaum, "Untraceable Electronic Mail, Return Addresses, and Digital Pseudonyms", Communications of the ACM, Volume 24, Number 2, February 1981
NymIP Research Group, "The NymIP Effort", announced at the 49th meeting of the IETF, Dec. 2000
David Chaum, "Achieving Electronic Privacy", Scientific American, August 1992, p. 96-101
John Erickson, Matt Williamson, Dave Reynolds, Poorvi Vora, Peter Rodgers, "Principles for Standardization and Interoperability in Web-based Digital Rights Management", A Position Paper for the W3C Workshop on Digital Rights Management (January 2001)