WEBDAV Minneapolis IETF minutes

Below are the preliminary minutes from the WebDAV meeting at the Minneapolis
IETF.  If you have any comments/corrections, please send email to
<ejw@ics.uci.edu>.

Thanks!

- Jim

----------------------
Meeting Minutes
WEBDAV WG
Minneapolis IETF
March 17, 1999

The WEBDAV working group met at the Minneapolis IETF, on March 17,
1999, from 15:30 to 17:30.  The meeting was chaired by Jim Whitehead,
and Yaron Goland recorded notes.  Approximately 55 people attended.

The meeting began with a brief review of the agenda (overview of
DELTA-V BOF, issues from the Advanced Collections specification,
creating a property registry, moving access control forward).

DELTA-V BOF PRESENTATION

Jim Amsden gave a brief presentation on the DELTA-V BOF, which was
held in the previous session.  The presentation gave an overview of
the scope of the effort proposed in the DELTA-V charter.

Jim's presentation began with a short history of why WebDAV was
created. When WebDAV set out to create document management features,
it found that versioning was critical and included it from the start.
As WebDAV progressed, it was found that versioning was very hard and
that it required its own working group. DELTA-V is that proposed
working group.

The protocol that is proposed for DELTA-V will contain the following
features:

   Versioning - Ability for a resource to be checked into a version
   controlled system where it has multiple revisions that are tracked
   and can have multiple successor and predecessor relationships. The
   server will maintain those relationships, report the revision
   history, and control write access to these revisions using
   check-in/check-out operations.

   Parallel Development - Provides more resource availability in a
   multi-user environment. Multiple users can check out the same
   revisions of a resource, track who holds those check-outs, and
   merge them back into each other later on as appropriate.

   Configuration Management - A means of collecting a group of
   consistent revisions of resources together. The protocol will
   support creating configurations, putting revisions in them and
   tracking them over time.

Jim Amsden finished by reporting that the first BOF was just
completed, and seemed to be a reasonable success. There is an archived
mailing list, ietf-dav-versioning@w3.org.

ADVANCED COLLECTIONS PROTOCOL

After Jim Amsden's presentation, the floor was turned over to Judy
Slein for discussion of issues from the Advanced Collections protocol
specification.

The first issue concerned what the default behavior should be for
certain methods like copy and lock when applied to references.  There
are two types of references in the Advanced Collections protocol
specification:

   Redirect references - for servers that want to provide basic
   referencing capabilities at minimal cost. The server never acts as
   a proxy (i.e., the server does not forward methods along to the
   target resources). The disadvantage is that the reference is very
   visible to clients, and clients have to take actions (based on the
   returned 3xx status code) to resolve the reference.

   Direct references - cheap for clients but expensive for
   servers. The servers resolve these automatically and provide the
   illusion that the client is working directly on the referenced
   resource (a.k.a. target resource).

The general rule of thumb for default behavior is that when you apply
a method to a redirect reference you get a 302 response whose
Location header gives you the URI of the target resource. The default
behavior for a direct reference is for the server to automatically
apply the method to the target resource itself.
Judy stated that in the ideal, these default behaviors should be the
rule for all methods, but there are some cases which make this not
possible. During discussions on this issue, the spec. authors have
developed principles to deal with situations where the default
behaviors do not apply.  These discussions led to the realization that
there are four cases when determining the behavior of a method when
references are present: Redirect references, Collections that contain
redirect references, Direct references, Collections that contain
direct references.

The first rule the authors developed was to ensure that whether a
method is applied to a single direct reference or to a direct
reference in a collection, the behavior of the method will be the
same. The same logic applies for redirect references. So we really
have two cases. We would like to be able to do the same thing for
redirect and direct references, either apply to the target or not. But
we haven't been able to do that.

At this point there was some Q&A from the attendees.  One attendee
asked why there are both direct and redirect references.

Some arguments from this discussion in favor of having both direct and
redirect references:

  - Redirect references are easier for servers to implement than
    direct references
  - Security: the server may not want to perform an action on behalf
    of the client because of the security implications (and hence
    would either not want to implement direct references, or would
    limit the target to be a resource on the same server (or
    administrative domain) as the reference)
  - Redirect references can have a target which is not an http scheme
    URL (e.g., ftp or gopher URLs), and it is unlikely that a server
    would proxy HTTP commands (some of which do not have equivalents
    in other protocols) to allow direct references to these URLs.
  - Servers already provide a redirect capability, and creating a
    redirect is performed by out-of-band mechanisms.  Redirect
    references provide a mechanism for remotely authoring, via HTTP,
    these redirects.  File systems often contain both kinds of
    reference, both direct and redirect-style, and it would be useful
    to be able to author both kinds of reference.

Arguments raised against having both kinds of reference:

  - Direct references appear to have the same set of features as redirect
    references, so why have both?
  - If a client is redirected to the target resource (via a redirect
    reference) once, that is more efficient than if a server constantly
    forwards requests, as is the case with direct references.

Judy closed discussion by noting the issue that the specification may
not necessarily need two different kinds of references. She also noted
that the specification was unlikely to change so fundamentally at this
point.

Judy continued her presentation.  Judy observed that the specification
authors have developed a set of design principles which are not
orthogonal, and must be traded off for some methods. The authors have
developed the following principles:

- All references should be usable by down-level clients and the
  default behavior should be what makes the most sense to a down-level
  client.
- The behavior of a method applied to a referential resource should be
  consistent, whether it is applied to an individual reference or a
  reference encountered when processing a collection.
- A server should never need to resolve a redirect reference and act
  as a proxy (we never violated this).
- Behavior should be consistent across all methods as far as possible.
- We want to be consistent with WebDAV and HTTP semantics for
  methods.

Unfortunately, these principles lead to conflicting design choices for
some methods.

Applying the principles is easy for the methods GET, HEAD, OPTIONS,
PUT, POST, MKCOL, MKREF, PROPPATCH, and PROPFIND. For a redirect
reference, the server responds with a 302. For a direct reference, the
server applies the method to the target.

The more difficult methods are DELETE and MOVE.  For these, always
apply the method to the redirect reference resource itself, and also
apply the method to the direct reference resource. For COPY, LOCK and
UNLOCK, the method is applied to the redirect reference resource,
while for direct references, the method is applied to the target.
There is no consensus on the last three.
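
The method behaviors described above can be summarized as a lookup
table. The following is a minimal Python sketch, not text from the
specification: the names RefKind, Action, and action_for are
hypothetical, and the COPY, LOCK, and UNLOCK rows reflect the authors'
proposal, for which no consensus had been reached.

```python
from enum import Enum


class RefKind(Enum):
    REDIRECT = "redirect"
    DIRECT = "direct"


class Action(Enum):
    RESPOND_302 = "respond with 302 and the target URI"
    APPLY_TO_TARGET = "apply the method to the target"
    APPLY_TO_REFERENCE = "apply the method to the reference itself"


# Methods for which the default rule applies cleanly.
DEFAULT_METHODS = {"GET", "HEAD", "OPTIONS", "PUT", "POST",
                   "MKCOL", "MKREF", "PROPPATCH", "PROPFIND"}
# Methods that act on the reference for both kinds of reference.
REFERENCE_METHODS = {"DELETE", "MOVE"}
# Proposed behavior only; no consensus at the time of the meeting.
SPLIT_METHODS = {"COPY", "LOCK", "UNLOCK"}


def action_for(method: str, kind: RefKind) -> Action:
    """Return the behavior the minutes describe for a method on a reference."""
    method = method.upper()
    if method in DEFAULT_METHODS:
        if kind is RefKind.REDIRECT:
            return Action.RESPOND_302
        return Action.APPLY_TO_TARGET
    if method in REFERENCE_METHODS:
        return Action.APPLY_TO_REFERENCE
    if method in SPLIT_METHODS:
        if kind is RefKind.REDIRECT:
            return Action.APPLY_TO_REFERENCE
        return Action.APPLY_TO_TARGET
    raise ValueError(f"method not covered by the minutes: {method}")
```

For example, action_for("DELETE", RefKind.DIRECT) yields
Action.APPLY_TO_REFERENCE, which matches the rationale that DELETE
should change the membership of the collection holding the reference
rather than that of the target's collection.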

Judy noted that for MOVE and DELETE, there appears to be consensus
because their semantics are similar to those supported by file
systems. The rationale for applying them to the reference resource,
rather than its target, is that MOVE and DELETE affect the membership
of collections, and it would be undesirable if MOVE and DELETE,
through secondary effects, modified the membership of the target
collections.  There was general agreement in the room on this point.

COPY FOR REFERENCES

Judy went on to the semantics of COPY.  For COPY, the expectation is that
the destination of a COPY should be a new resource, and operations on
that new resource do not affect the original resource. However, what
is the expectation if you copy a collection with references? Is the
expectation the new collection will have copies of the references or
of the targets? If you want to get 302 in all the same cases then you
want to copy the references. If you want to have safe resources to
play with then you want the targets to be copied.

Discussion on this topic then ensued. One thread of discussion
concerned whether the behavior of COPY on individual resources should
be the same as COPY on collections of resources.  Some attendees noted
that copying collections is a difficult and option-laden activity in
operating systems and in programming languages (e.g. LISP has five
different copy operators based on various conditions).  Choices in
programming languages haven't been encouraging: either they make one
wrong choice, which leads to people creating many different types of
copy, or they choose one and tell everyone else to go away.  It was
noted that one source of underlying difficulty with COPY is that the
term copy has lots of different meanings for computers, and for paper
too.

Larry Masinter then proposed that the safest thing is to perform the
least amount of work, and hence copy of a reference should always just
copy the reference resource, and not the target.  Of the two choices,
copy the reference or copy the target, copying the reference is the
least amount of work for the server.  Mark Day noted that while the
choices in the protocol should surprise the least number of people,
it's not always possible to avoid surprising a significant part of the
population. We have to be ready to make an arbitrary choice because we
can't converge on the easy to use solution.  Larry Masinter then
proposed that the protocol specification document indicate clearly
that copy is complicated, and that users will have different
expectations of what copy means.  There was general agreement in the
room for the "do the least amount of work, but document the
difficulty" approach.
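
The "least amount of work" approach to copying a collection can be
sketched as follows. This is an illustrative Python model only: the
Document and Reference classes and the copy_collection function are
hypothetical, and a collection is assumed to be a simple mapping from
member name to resource.

```python
from dataclasses import dataclass, replace
from typing import Dict, Union


@dataclass
class Document:
    body: str


@dataclass
class Reference:
    target: str  # URI of the target resource (assumed representation)


Member = Union[Document, Reference]


def copy_collection(source: Dict[str, Member]) -> Dict[str, Member]:
    """Copy each member as-is; a reference is copied as a reference.

    The target of a reference is never fetched or duplicated, so the
    copied reference still points at the original target.
    """
    return {name: replace(member) for name, member in source.items()}
```

Under this choice, a client that received a 302 for a member of the
source collection receives a 302 for the same member of the copy, but
writes made through a copied direct reference would still reach the
original target. That is the surprise the minutes suggest documenting.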

LOCK FOR REFERENCES

Discussion then moved onto LOCK.

It was noted that returning one 302 response for each reference in a
collection of redirect references would cause LOCK to fail (since it
has all-or-nothing semantics).  Hence, locking the reference is the
desired functionality for redirect references.  Direct references have
their own set of problems.

One attendee noted that there are four possible choices when locking a
direct reference: lock the target, lock the reference, lock both, or
lock neither. Neither is out.  As with COPY, you can make very
reasonable cases for all of them, and as with COPY, the choices seem
very arbitrary.  Locking the target makes a lot of sense except that
you could still move or delete the reference. So if the target is
locked and someone moves or deletes the reference, that might be
surprising to the client. Or it might not be. How often do you go and
try to move or delete something that someone has locked?

One person noted that the point of a lock is to protect the contents
of the persistent state of the resource. Locking the target would at
least honor those semantics.  However, another person noted that a
lock affects both the contents and the namespace, and a lock on a
reference needs to protect against both namespace operations and
content modification operations.

A proposal was made that a lock on a direct reference should lock the
target, but cause the reference to behave as if it were locked.  That
is, MOVE and DELETE on the reference would fail if the target were
locked.  There was some discussion on the impact this might have on
the No-passthrough header. There was a proposal that if the target of
a reference is locked, then operations that are performed without the
No-passthrough header behave as if the reference is also locked.
However, if they are supplied with No-passthrough, then they do not.
The particular case is DELETE.  There is also a need to define how
references behave when their target is locked.  Some, but not all,
people in the room appeared to support this proposal.
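
The proposal above can be modeled in a few lines. The following
Python sketch is an assumption-laden illustration, not an agreed
design: the Server class and its methods are hypothetical, lock
tokens and lock owners are ignored, and the 423 (Locked) and 204
status codes are used only to show the outcome of DELETE.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Server:
    # Direct references: reference URI -> target URI (assumed model).
    references: Dict[str, str] = field(default_factory=dict)
    locked: Set[str] = field(default_factory=set)

    def lock(self, uri: str) -> None:
        """A LOCK on a direct reference locks the target, not the reference."""
        self.locked.add(self.references.get(uri, uri))

    def behaves_locked(self, uri: str, no_passthrough: bool = False) -> bool:
        """Without No-passthrough, a reference inherits its target's lock."""
        if uri in self.locked:
            return True
        if no_passthrough:
            return False
        target = self.references.get(uri)
        return target is not None and target in self.locked

    def delete(self, uri: str, no_passthrough: bool = False) -> int:
        """DELETE acts on the reference itself; returns a status code."""
        if self.behaves_locked(uri, no_passthrough):
            return 423  # Locked
        self.references.pop(uri, None)
        return 204  # No Content
```

In this model a DELETE on a reference whose target is locked fails,
while the same DELETE sent with No-passthrough succeeds, which is the
particular case the discussion singled out.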

IMPACT OF REFERENCES ON URL RESOLUTION

The crux of this issue is, if you create a reference to a collection,
are you forcing the server to create references to each member of that
collection? The answer of the specification authors is, no you are
not, because the server doesn't need all those additional references.

As an example, suppose there is a direct reference called BLAH to a
target which is a collection called FOO, and FOO contains a member
called BAR.  If a client performs a GET on BLAH/BAR, the specification
authors say BLAH/BAR is just an alternate URL, while Yaron Goland
insists that, from an HTTP perspective, BLAH is a resource, and so is
BLAH/BAR.  But is it a direct reference? Are there operations which
can be performed on FOO/BAR that cannot be applied to BLAH/BAR?
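
The "alternate URL" reading of this example can be sketched as prefix
rewriting. The Python below is a hypothetical illustration, assuming
that direct references are held in a mapping from reference path to
target path; the resolve function is not taken from the specification
and does not follow chains of references.

```python
from typing import Dict


def resolve(path: str, direct_refs: Dict[str, str]) -> str:
    """Rewrite the longest leading prefix of path that is a direct reference.

    No resource is created for BLAH/BAR; the path is simply mapped onto
    the corresponding member of the target collection.
    """
    segments = path.strip("/").split("/")
    for i in range(len(segments), 0, -1):
        prefix = "/" + "/".join(segments[:i])
        if prefix in direct_refs:
            # Keep the remaining segments and append them to the target.
            return "/".join([direct_refs[prefix]] + segments[i:])
    return path
```

With direct_refs = {"/BLAH": "/FOO"}, resolve("/BLAH/BAR", direct_refs)
gives "/FOO/BAR". The sketch says nothing about whether BLAH/BAR should
itself be reported as a reference, which is the open question here.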

Filesystems solve this problem by making both FOO and BLAH pointer
objects, and ref-counting BAR. UNIX prevents this by barring
hard links to a collection.

One attendee suggested that if BLAH is a direct reference to a
collection, it should only support operations on BLAH, not on URLs
which are BLAH/x. However, this approach was considered, and rejected,
in an authors' group teleconference because there is an expectation
that references support this kind of namespace operation.

Larry Masinter noted that it is bothersome that BLAH/BAR is not a
*direct* reference, or at least behaves like one. You can discover its
target, so it quacks like a reference. Only operations on that
name -- without No-passthrough -- behave a little differently.
Judy responded that even before references came along, you could
already have multiple URLs for the same resource -- this is the same.

Yaron stated that there may not be a problem here. However, this
namespace redirection action is sufficiently novel that it deserves to
be addressed in the specification more than it currently is.

PROPERTY REGISTRY

Jim Whitehead then began discussion on having a registry for WebDAV
properties.  The discussion began with a brief overview of WebDAV
properties.  Properties are name/value pairs where each name is a URI,
which could be a URL. A nice quality of names being URLs is that if
you want to define a new property for your use, then you can create a
new property, assign it a URL in a namespace you own, and you can have
a fresh name without running into any namespace collisions.

One attendee noted that a problem arises when concatenating namespaces
and element names, because the XML community defined namespaces which
don't end in separators, and hence you can't deconstruct a namespace
with an element name appended.  However, other participants noted that
WebDAV defines rules for concatenating a namespace and an element
name, and noted that the XML namespaces specification delegates to its
users the definition of the namespace and element relationship.

Alex Hopmann stated that it is useful to have rules on how you take
the on-the-wire names and expose them as a single identifier. If you
are talking about a property registry where the name is a full URI,
then you need a rule for how it is separated into a namespace and a
property (element) name.
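
The concatenation problem can be shown concretely. In the Python
sketch below, join_name assumes plain string concatenation of
namespace and element name, and split_name assumes a hypothetical
convention of splitting after the last "/" or "#". Neither function
nor the example.com URIs come from the meeting; they only illustrate
why a separation rule is needed.

```python
from typing import Tuple


def join_name(namespace: str, local: str) -> str:
    """Form a single identifier by plain concatenation (assumed rule)."""
    return namespace + local


def split_name(full: str) -> Tuple[str, str]:
    """Split after the last '/' or '#' (a hypothetical convention)."""
    cut = max(full.rfind("/"), full.rfind("#")) + 1
    return full[:cut], full[cut:]
```

Two different pairs collide when the namespace has no trailing
separator: join_name("http://example.com/props", "author") and
join_name("http://example.com/propsau", "thor") produce the same
string, and split_name cannot recover either original pair from it.
When the namespace ends in "/", the round trip works.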

Jim Whitehead stated that he would note the issue, and moved on.  The
value of a property is a sequence of well-formed XML. There are cases
when it is desirable to register properties. The property may have wider
utility than just your client or server, or people will get together
to create properties that have wider use. You need to register the
property name and the namespace.

One attendee asked whether it makes sense to register an entire
namespace, in order to assert that this is a namespace I am going to
use, and I want people to know.

Attendees also noted that it would be good to register whether the
property is live or dead, and that there should be a URL for getting
more (human readable) information about the properties.

Larry Masinter stated, in his view, a property registry is almost
useless. What people do with properties is they build protocols out of
them. Properties rarely have an independent, separable semantics from
other properties. People define a suite of properties and then
implement a protocol that goes along with it. The suite of properties
mean something together. Registering properties without defining their
relationship is like defining a registry of HTTP headers without
defining their relationships. Actually... a registry of HTTP headers
would be useful, so I take it back, but it is hardly enough to
understand what the HTTP headers are. A registry of XML elements
doesn't tell you the semantics of the constructions you build out of
them.  Defining a registry by defining the properties alone is
insufficient.

Larry continued, stating that you need the whole schema. Dublin Core
kind of goes together. That is why it is a core. If you just took
author out of Dublin Core and didn't take provenance or authority,
they don't really hold independently without the set.

There was some discussion over the utility of using a single property
from a schema.  Some participants felt that the author property from
Dublin Core was a good example of a property that should be widely
reusable.  This discussion highlighted a fundamental difference in
belief about property reuse: some participants felt that individual
properties could be reused, while others felt that individual
properties could not be reused, but that sets of properties (schemas)
which hang together could be.

Another participant asked whether we see any reason why the success or
usage of this registry should be a different experience from the
registries of directory attributes or MIB objects (experience with
these registries has been negative).  There was no answer to this.

There was a brief discussion on the proper forum for registering
properties.  Jim Amsden asked, if the document management community
wants to get together and define properties to contain document life
cycle information, what is the forum that this group would have to
publish that schema of properties to share them, agree on them.  There
was a suggestion that they publish the schema as an Informational RFC.
However, this was perceived as being too formal, and there was concern
that other companies would not find this document in the RFCs listing.
The web site www.webdav.org was suggested as the place to go to find
these pointers.  Other organizations might also provide an index.

ACCESS CONTROL

With time already expired, there was a brief discussion on how to move
access control forward.  Jim Whitehead noted that he has heard that
the authors are not able to drive forward with the ACL draft.

There was a suggestion from the participants that, since there is at
most one person at a time who is willing to push forward with this
stuff, it isn't worth the interest of the IETF.  However, Jim
W. countered by noting that every time he meets someone implementing a
WebDAV server their first criticism is "where is the access control?"
So the fact that no one is working on it here doesn't indicate that
there isn't a problem.

Jim Whitehead ended by calling for volunteers to work on the access
control protocol specification.

*** Meeting Adjourned ***

Received on Monday, 29 March 1999 20:33:34 UTC