[Minutes Overview] [Workshop Home] [Previous:Publishers Requirements] [Next:Architecture: Interoperability and Standards]

Minutes from the Architecture/Infrastructure Session

Please refer to the position papers and slides for authoritative answers. The following minutes are only a snapshot of the presentations and discussions.

INDECS Framework Data Definitions
Godfrey Rust (Indecs Project)
URIs and Object Identifiers
Dan Connolly (W3C)
Principles for Standardization and Interoperability in Web-based Digital Rights Management
John Erickson (Hewlett-Packard)
Open Digital Rights Management
Renato Iannella (IPR Systems)
Digital Object Identifier
Norman Paskin (Int. DOI Foundation)
Discussion

Godfrey Rust (<indecs> Project), INDECS Framework Data Definitions

See also the [Slides (ppt)] and the <indecs> Framework

The <indecs> project ended in 2000; we are now a new company called <indecs> framework. Look at our online document: Principles, model, and data dictionary, June 2000.

We see DRM as a metadata problem. Description is covered by Open eBook, ONIX, and my company.

Here is the scheme:

people make stuff
people use stuff
people do deals about stuff

The scope is stuff. This can be characterized in terms of:

Parties
Creations
Agreements

Rust projected a diagram showing the high granularity of the <indecs> model, with a hierarchy of parties and agreements. He noted that you must pass along the metadata in a structured, defined way to permit computational processing.

The following things are required:

functional granularity: you must be able to identify stuff at any level of granularity
unique id
who says so - designated authority
appropriate access (who can do what)

In the 1980s, there were few schemes for description; today there are many. Rust listed ten major ones, including MPEG-7, ONIX, SMPTE, RIAA/IFPI, and several more.

Here are the <indecs> principles:

All metadata is just a view (example: about the work versus about the manifestation, and more, each of which may have its own rights)
- views must not be confused; mistaken identity can be disastrous to rights management
- views need to be interoperable
Almost all terms need identifiers
- values must be defined and identified
- need standard vocabularies and ontologies.
- automation needs disambiguation. There is an existing vocabulary for some things: territories, language, currency, date/time and some others. But we need dozens of others.
Events are key to interoperability
- most metadata is stuff or people based
- event descriptions are key to rights management

Godfrey Rust gave an example of how this would work:

make event the first class object, e.g., Rust creating these slides
then all the other elements are attributes: author, date, title, etc.
next event, e.g., Rust showing this Slide -- it has attributes too and references the previous event, thus connecting the creative items
next event: Norman Paskin adapts (transforming event) these slides before he shows it at another meeting, thus creating new attributes and references to preceding events, thus connecting the creative items

This model has the same information as other metadata structures, just organized differently to serve rights management.
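The event-centred organisation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` class, its field names, the identifiers `e1` to `e3`, and the helper functions are all hypothetical and are not taken from the <indecs> data dictionary.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    """An event is the first-class object; everything else is an attribute."""
    event_id: str                      # hypothetical identifier
    kind: str                          # e.g. "create", "show", "adapt"
    agent: str                         # the party performing the event
    attributes: Dict[str, str] = field(default_factory=dict)
    refers_to: List[str] = field(default_factory=list)  # ids of earlier events


def lineage(events: Dict[str, Event], event_id: str) -> List[str]:
    """Follow references from an event back to the events that preceded it."""
    seen, stack, order = set(), [event_id], []
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        order.append(current)
        stack.extend(events[current].refers_to)
    return order


def contributors(events: Dict[str, Event], event_id: str) -> List[str]:
    """Agents of creating or transforming events in the lineage, newest first."""
    return [events[e].agent for e in lineage(events, event_id)
            if events[e].kind in ("create", "adapt")]


# The three events from the example in the minutes (ids are made up).
EVENTS = {
    "e1": Event("e1", "create", "Godfrey Rust",
                {"title": "INDECS Framework Data Definitions"}),
    "e2": Event("e2", "show", "Godfrey Rust", refers_to=["e1"]),
    "e3": Event("e3", "adapt", "Norman Paskin", refers_to=["e2"]),
}
```

Calling `contributors(EVENTS, "e3")` walks from the adaptation back to the original creation, which is the kind of connection between creative items that a rights process would need to follow.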

Here's another event that bears on rights: agreeing. What goes into the agreement is what goes into descriptive metadata: what he had, what he did. There is also assertion by a trusted entity that verifies or authenticates.

Using the event structure, we now have six events regarding this slide show. See how you can use events to integrate descriptive and rights metadata.

But we need rights vocabularies to make this work, on a parallel with the need for vocabularies to serve descriptive metadata.

Dan Connolly (W3C), W3C URI Design Principles

See also the corresponding Activity within W3C and the slides

He outlined W3C's Philosophy of Standards: Help people do the right thing.

URIs will have a relationship to a potential DRM-Activity. Connolly also suggests that DRM discussion focus on payments and rights negotiation as much as prevention of access.

All names are ultimately local. Global naming depends on social agreements and trust. HTTP is not the only protocol. For the Web, we use DNS (Domain Name System). Don't forget, HTML is not the only file type. Things can evolve: you can use proxies and thus use an old name with a new protocol. But URIs are the only thing in that arena. New protocols can be used with existing names. There is no need to change names just because you change protocols. We don't need to make new URI schemes just because we have made a new data format.

There is opacity: don't peek inside names; names (URIs) are not a user interface. Don't reinvent redirection in HTTP, it is not a service, they are not locators. DNS supports multiple A records. Lots of administrative hierarchies fit in current HTTP and DNS; you don't need to invent a new URI scheme.
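As a rough illustration of the two points above (names stay stable when protocols change, and names are treated as opaque), here is a minimal Python sketch. Everything in it is hypothetical: the `Proxy` class, the two stand-in fetch functions, and the name `http://example.org/report`. No network access takes place.

```python
from typing import Callable, Dict

# A fetcher stands in for "speak some protocol to get a representation".
# Both functions below are hypothetical placeholders; no network is used.
Fetcher = Callable[[str], str]


def fetch_old_protocol(name: str) -> str:
    return f"representation of {name} via old protocol"


def fetch_new_protocol(name: str) -> str:
    return f"representation of {name} via new protocol"


class Proxy:
    """Maps names to a way of retrieving them, without parsing the names."""

    def __init__(self, default: Fetcher) -> None:
        self._default = default
        self._overrides: Dict[str, Fetcher] = {}

    def use_for(self, name: str, fetcher: Fetcher) -> None:
        # The name is an opaque dictionary key; it is never inspected.
        self._overrides[name] = fetcher

    def get(self, name: str) -> str:
        return self._overrides.get(name, self._default)(name)


NAME = "http://example.org/report"        # hypothetical name
proxy = Proxy(default=fetch_old_protocol)
before = proxy.get(NAME)
proxy.use_for(NAME, fetch_new_protocol)   # the protocol changes, the name does not
after = proxy.get(NAME)
```

The caller keeps using the same string throughout; only the proxy's routing changes.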

URIs were not designed as a user interface. Don't use a URN as a brand name. Trying to wedge a new trusted brand name into DNS is a problem; you ought to use <title>.

John Erickson (HP), Principles for Standardization and Interoperability in Web-based Digital Rights Management

See also the slides and the Position Paper

John Erickson started by reinforcing what was already heard during previous sessions: when we think about DRM, we have to separate the expression of rights information and policies from the enforcement of those rights. We have to think of a layered model that separates the expression of rights information from the information for discovery and from implementing and enforcing those rights.

What is the W3C's role here? We think W3C should recommend a platform. Erickson put emphasis on the development of a language and a protocol for IPR policy expression, discovery, and interpretation.

The W3C should not recommend a standard DRM system. But we should provide a basis for the interoperability of such systems. Core should be to find a reliable way to express and transfer rights information. Remember the design principles of the web, IPR work ought not violate them.

Erickson developed the following set of requirements:

never interfere with users' ability to discover information (including rights information) on the web; this is what I mean by universal access, so I can decide whether to access
always communicate the policies and technical restraints in understandable language
policies are communicated in fair and open ways
need for trust, need to have a basis to trust the assertions being made, need a mechanism to assure trustworthiness
IPR information and policies must be discoverable and minimally interpretable independent of any given vendor's solution
the languages and protocols must be designed for evolution
web-based mechanisms must allow owners to choose different tools and consumers to use different tools to discover and interpret rights information
cool new content that comes along ought not break the DRM systems or break the languages and protocols

Here's our [HP publishing group] proposal: PREP (Policy and Rights Expression Platform, see the Position Paper for more information). It would be a framework to express and interpret the policies and info. It should complement laws and self-regulatory programs. It should be consistent with prior work, e.g., P3P.

What are the building blocks of PREP?

semantics - policy interpretation mechanisms
objects - rights messaging protocol
syntax - rights expression languages

Erickson concluded:

W3C should recommend a platform for IPR policy expression, discovery, and interpretation.
W3C should not recommend a standardized digital rights management system.
Core should be reliable way to express and transfer rights information.

Renato Iannella (IPR Systems), Open Digital Rights Management

See also the slides and the Position Paper

We need to define DRM formally. There are a lot of definitions out there, and customary DRM definitions tend to emphasize protection, enforcement, and security. We need to remove the security/locking focus of DRM. We want DRM to be broader: describe, identify, trade, monitor, track, and manage rights holder relationships.

We want to leave behind the "creation waterfall" concept of create => trade => use as a line, and look at a circular life cycle approach (compare Rust's events). We use the same terms, create-trade-use, and add reuse = recreate. Then the life cycle is more accurately seen as circular.

He gave an example of transparency in presenting rights, from Adobe eBook: you are authorized to get the copy. It is OK to copy (up to 10 times each week), to print, and to lend, but you cannot give or read aloud. He suggested that it should be possible to pay for usage, not possession.
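The eBook permissions quoted above can be written down as data and checked mechanically. The sketch below is only an illustration and is not ODRL or any Adobe format: the `Permission` and `UsageLog` names are hypothetical, and "10 times each week" is interpreted, as an assumption, as a rolling seven-day window.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Permission:
    allowed: bool
    max_per_week: Optional[int] = None   # None: no count limit was stated


# Expression of the rights, kept separate from any enforcement logic.
POLICY: Dict[str, Permission] = {
    "copy": Permission(True, max_per_week=10),
    "print": Permission(True),
    "lend": Permission(True),
    "give": Permission(False),
    "read_aloud": Permission(False),
}


class UsageLog:
    """Records granted requests and applies the weekly limit (assumed rolling)."""

    def __init__(self) -> None:
        self._granted: Dict[str, List[datetime]] = {}

    def request(self, action: str, when: datetime) -> bool:
        rule = POLICY.get(action)
        if rule is None or not rule.allowed:
            return False                  # unknown or prohibited action
        history = self._granted.setdefault(action, [])
        if rule.max_per_week is not None:
            window_start = when - timedelta(days=7)
            recent = [t for t in history if t > window_start]
            if len(recent) >= rule.max_per_week:
                return False              # weekly allowance used up
        history.append(when)
        return True
```

With this policy, ten copy requests within a week are granted, an eleventh is refused, and a request made more than seven days after the earlier ones is granted again; `give` and `read_aloud` are always refused.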

The following are building blocks for a DRM architecture:

A better metadata framework: He would like it to be in RDF
Trust (digital signatures)
There are lessons to learn from P3P, CC/PP
identification (URIs)
XML packaging and tools
use the <indecs> model: content, parties, rights. Each one is a kind of class. Parties described would be: author, corporate, agent, publisher. The content would be classified in work, expression, manifestation, item. Rights would be classified in usages, rewards, constraints.

Renato Iannella presented the ODRL. He sees it as a starting point for a DRM language that could be developed within W3C.

He concluded that the W3C's role could be:

W3C Digital Rights Language Working Group to develop semantics of a digital rights language encoded in XML
Trusted Metadata Working Group to develop architecture to support encoding and transmission of DRM and other metadata
DRM Interest Group to discuss next steps and establish relationship with other communities

W3C can solve some part of the DRM problem, coordinate others, and empower the user community.

Norman Paskin (Int. DOI Foundation), Digital Object Identifier (DOI)

See also the slides (ppt) and the Position Paper

Paskin presented the activity of the DOI Foundation. We have spent three years developing an identifier system for digital objects. We have been influenced by <indecs> analysis and implementations, e.g., ONIX, and by consideration of digital object infrastructure (e.g., CNRI work).

DRM must be maximally extensible. DRM is digital management of rights, not just management of digital rights. Practical rights management will require dealing with both digital and non-digital rights. Unique identification is essential for automation to work on this.

Description info and rights info are not distinguishable. Any piece of description may be needed in a rights transaction.

Creative items used to be physical; today we have both a physical and a digital manifestation, so sometimes there are two identifiers, e.g., ISBN for one, URL for the other. But if we are going to automate transactions, we must disambiguate meanings. We need to define a word like "book" in the spaces it may be found in, the ISBN space or the <indecs> space.

There will not be one model for applying identifiers; it will differ between content communities, given practical implications, e.g. ONIX, MPEG-7, etc. A work may be an original manuscript version, the work in the abstract, a draft, a copy in a publication, a digital copy not in a publication, a reprint, etc. In each role, there will be different ids and attributes. We don't have to have complete knowledge representation. We can build on agreements over what an identifier means within a given namespace.

About names and locations, he said that a name is a location in a defined namespace; thus "all names are locations" is trivially true.

As practical needs for DOI, Paskin identified:

multiple instances
persistence in face of change
management of non-digital entities
de-referencing, resolution

Who should be responsible for naming: Standards bodies, rights collectives? Examples are:

EAN/UPC bar code system
ISBN system
URI system

But what about the digital realm? URLs are a poor system for publishers.

Identifiers need to be actionable. They can be the basis for rights management. But there won't be one place to go for:

e.g., directory of parties (names of people, sort of, as for music, is developing a directory)
e.g., ontology of scientific article

We need to involve stakeholders, what is the W3C good for here?

Paskin used an aphorism: I think what is called media convergence really is "people convergence" (with the correlative problem of communication). Formalisms are essential in their place but must be explained. What we ought to care about does not just encompass the web.

The DOI system offers:

numbering - use any identifier
description - can use <indecs> framework
action - handles allow linking to instances
It is persistent, granular, flexible, can wrap other identifiers
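The properties listed above (one identifier, multiple instances, persistence when locations change, wrapping of other identifiers) can be pictured with a toy resolver. This is a sketch only and does not reflect the actual DOI or Handle System interfaces: the `Resolver` class, the identifier `10.9999/example.book`, the URLs, and the placeholder ISBN are all made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Record:
    locations: List[str] = field(default_factory=list)        # current instances
    wrapped_ids: Dict[str, str] = field(default_factory=dict)  # e.g. an ISBN


class Resolver:
    """Toy lookup table from a persistent identifier to its current instances."""

    def __init__(self) -> None:
        self._records: Dict[str, Record] = {}

    def register(self, identifier: str, locations: List[str],
                 wrapped_ids: Optional[Dict[str, str]] = None) -> None:
        self._records[identifier] = Record(list(locations), dict(wrapped_ids or {}))

    def resolve(self, identifier: str) -> List[str]:
        # Return a copy so callers cannot alter the stored record.
        return list(self._records[identifier].locations)

    def relocate(self, identifier: str, old: str, new: str) -> None:
        # Only the location changes; the identifier itself stays the same.
        record = self._records[identifier]
        record.locations = [new if loc == old else loc for loc in record.locations]

    def wrapped(self, identifier: str) -> Dict[str, str]:
        return dict(self._records[identifier].wrapped_ids)


resolver = Resolver()
resolver.register(
    "10.9999/example.book",                        # hypothetical identifier
    ["http://example.org/book", "http://mirror.example.net/book"],
    {"isbn": "0-00-000000-0"},                     # placeholder, not a real ISBN
)
resolver.relocate("10.9999/example.book",
                  "http://example.org/book", "http://example.com/archive/book")
```

After the relocation, anyone holding the identifier still resolves it successfully, which is the sense of "persistence in face of change" in the list of practical needs.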

Discussion

The speaker was not identified for every question and answer. In those cases, you'll see only "Question" and "Answer".

Question: Is a URI primarily an address where you find something?

Dan Connolly (W3C): I don't think so.

Danny Weitzner (W3C): Question to Godfrey Rust and John Erickson: Is there a consensus point? John said there should be a rights management schema. How modular is the <indecs> system? Does anyone who uses something that falls under the <indecs> model have to use the whole model?

Godfrey Rust (<indecs>): It is a matter of how you structure your information. If we use the event model to organize our system, this should lead to interoperability. If one wants to express things in the most efficient way, an event-based system will be very powerful. Other systems and legacy information can be transformed into the event model. One doesn't have to organize all data to a high level of functionality. We can still use information that is fairly low grade.

John Erickson (HP): An event model is powerful, because it allows description of certain rights relationships that we might think of in terms of electronic contracts. If we speak about rights languages, we can imagine a lot of different types of transactions. There are things that need to be declared between an author and a publisher. It is a dynamic activity with lots of outcomes that the event model can characterize. The event model is the basis of an ontology. We need different vocabularies for different purposes. A contract between an author and a publisher is like a dynamic state machine. The event model is a powerful way to express that. Things like rights vouchers/licenses and the output of individual states are dependent on dynamic events.
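The state-machine view of an author/publisher contract can be sketched as a transition table driven by events. The states and event names below are invented for illustration only; they are not taken from the position paper or from any rights language.

```python
from typing import Dict, List, Tuple

# (current state, event) -> next state. All names are hypothetical.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("proposed", "agree"): "agreed",
    ("agreed", "issue_license"): "licensed",
    ("agreed", "terminate"): "terminated",
    ("licensed", "terminate"): "terminated",
}


class Contract:
    """A contract whose state only changes in response to events."""

    def __init__(self, author: str, publisher: str) -> None:
        self.author = author
        self.publisher = publisher
        self.state = "proposed"
        self.history: List[str] = []      # the events applied so far

    def apply(self, event: str) -> str:
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {self.state!r}")
        self.state = TRANSITIONS[key]
        self.history.append(event)
        return self.state

    def may_issue_voucher(self) -> bool:
        # Outputs such as vouchers depend on the state the events have led to.
        return self.state == "licensed"
```

A contract that has seen `agree` and then `issue_license` may issue vouchers; the same contract after `terminate` may not, and an out-of-order event is rejected.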

Maximilian Herberger (Uni Saarland): Events are only one side. The event model reflects the state of subjective rights within a specific contract. We also need a way to describe how things fit together or what things have in common. We need a language at a higher level about objective rights. This language should be able to express classes of contract relationships. I think the combination of the two is the solution here.

Jonathan Schull (Digital Goods): Suppose I'm a publisher and I want to publish a book electronically. There are a whole lot of ways to do this. Each combination will have a different address, thus creating a different digital object. A publisher doesn't really care about the locations; he only cares about the initial work. He may decide that people shouldn't print it, or that people want their money back. As a publisher, I want to have only one thing to do.

Jonathan D. Hahn (Versaware): In a contrarian mode I say, well, about these "events," you know publishers may think of only one event, the one that came into play when I signed on to publish this work.

Godfrey Rust (<indecs>): A DOI is a single number for the object.

Dan Connolly (W3C): What's the first letter of most DOIs? Names are little pieces of communication. Making up a name without thinking about communication is sort of silly. We don't decide anything by ourselves, we decide together with people we communicate with. I don't think you can invent new technology that solves all the social problems involved.

Norman Paskin (DOI): The web is not the universal information space. There are things that aren't on the web. We need to identify them too.

Eric Miller (OCLC): You can place something on the web without using DOI. I represent libraries. If you publish, you want consumers to get access to stuff; you have to talk to your customer base to find out whether they will use the access mechanism you are designing to access content. What are the things we are trying to automate here? Let's think about a scenario with implemented DRM; let's do what-if scenarios.

Question: Why are ontologies so important?

Answer: Take a look at ontologies and what publishers require and you have your problem defined.

Eric Miller (OCLC): A question not yet resolved is, what happens in DRM if different kinds of people are accessing the same object, e.g. in the context of a library.

Robert Bollick (McGraw-Hill): This is already covered. It is like every consumer/publisher interaction, which is covered by the requirements. More information can be found on publishers.org. The name of the document is publishers DRM requirements.

Comment: We will get a lot of input/requirements from many different constituencies, e.g. record industry, book industry, etc. We need to consider the evolution of technology: technical components should be able to move independently from one another. We have a conceptual model: take a language and a context used by different areas and avoid using two different terms for the same requirement. We need to build a common platform. That gives you a mechanism to do extensions that are truly unique. Medicine and oil-drilling use different terminology, but a conceptual model helps us to decide whether their need is unique and helps us develop an orthogonal system and avoid redundancy.

Scott Foshee (Adobe) stated agreement with the interest in fine-grained identification implied by the <indecs>/DOI ideas.

Question: We won't be able to define precisely what a work is. We should avoid defining it. While broad categories of interactions may have been studied, do you think your publishing model is extensible to images, text, fonts? To Godfrey Rust: Do you think we can extend this to publishing of aggregations, e.g. a book written by an author combined with the paper it is printed on?

Godfrey Rust (<indecs>): Take a look at indecs papers. The answer is yes, I think this is possible.

Answer: The same is true for ONIX - thinking about selling pieces of a work

Danny Weitzner (W3C): Commenting on Norman's point that the Web is not the universal information space: I think the discussions so far have paid a lot of attention to commercial needs. The question is whether we would like a common framework for discovering the rights of a document, whether or not it was produced principally for trade. E.g., does a picture of my 3-year-old fit into this framework? We risk incurring big costs if there are two classes of documents: one that fits into trading and another that doesn't. If we look at music, we see that non-traditional documents are traded. I'm a bit concerned about the application of these systems only to "trade" items on the web, ignoring or disenfranchising the little objects which also have rights associated with them.

Godfrey Rust (<indecs>): The model we worked on is neutral as far as commerce is concerned. It can be used for a picture of a 3-year-old. We haven't actually developed the framework, though. A critical piece of work is to decide what those verbs (note: for the actions) are. The model still needs a lot of detailed work. We have roughly agreed on the direction we take. Please don't overestimate what we've done.

Norman Paskin (DOI): <indecs>/DOI is about transactions. By transaction we mean anything, whether it's free or not. We focused on e-commerce. Our economic model is based on the barcode model. For some transactions, the financial cost will be zero.

John Erickson (HP): What is the methodology for rationalising to interpret new dimensions for a given problem space? There has been a lot of talk about notion of ontologies. We have a certain way of thinking about a problem, and perhaps another way. Now we try to find ontologies to identify common points. Where does the notion of rationalizing problem spaces conflict with ontologies? I can see that it resonates in a closed room ...

Rob Koenen (MPEG): MPEG-7 is a standard for describing content. It is based on XML Schema; there are principal notions like actor, people, etc. Those are listed in a concepts list. There are basic concepts like shape, color, etc. People can build their own ontology. MPEG has just decided to do a data dictionary for a rights language. MPEG issued a call for requirements on 19 January 2001. Koenen invited W3C to work with MPEG on this problem.

Peter Schirling (IBM & MPEG): We should try to avoid unnecessary duplication. We allow each discipline to build an ontology from a common frame, to reduce duplication of elements. Under that framework, different sectors can add their own things to a specific concepts list. Currently, we have only concept lists very specific to audio-visual content.

Scott Foshee (Adobe): There are two classes of objects (things): under control and not under control. I wanted to state my agreement with Danny Weitzner that there should only be one class. Clipart in presentation software is an aggregation with content you created. By using a product that was licensed, everything on a hard disk is an aggregated content work. DRM should be able to handle that. Taking a picture of Danny Weitzner and applying a filter may result in an aggregate work. The process that is applied results in another object. We need to get something that is workable, because this technology will be everywhere.

Coffee Break, but not enough coffee for some


Created by Rigo Wenning February 2001
Last update $Date: 2001/04/18 16:50:31 $ by $Author: rigo $