Prov-XML ElementOrdering

From Provenance WG Wiki

Purpose

This page contains discussions on ordering of elements/attributes within the PROV-XML schema. It summarizes ISSUE 572.

Description

(Caution: the word "attributes" is overloaded between PROV-N and XML; rely on context to tell which meaning is intended.)

Within the PROV-N specification for most of our types, the syntax is something like this:

something(id; a, b, c, attrs);

where

  • a, b, c are either required or optional fields, and
  • attrs are optional attributes, specified as a set of attribute-value pairs

Those attributes can include specific prov:* fields like prov:location, prov:role, etc. or other fields in some other namespace.

The current approach to developing the XML complexType for each of those is to:

  • make prov:id an XML attribute
  • make each of a, b, c an XML element, and require that they be specified in the same order as in PROV-N
  • also make the attributes XML elements, and require that they come after the primary a, b, c fields, though they can be specified in any order among themselves.

The XML schema for the type becomes something like this:

<xs:complexType name="something">
  <xs:sequence>
    <xs:element name="a" type="..."/>
    <xs:element name="b" type="..."/>
    <xs:element name="c" type="..." minOccurs="0"/> <!-- optional -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:element name="location"/>
      <xs:element name="role"/>
      <xs:element name="label"/>
      <xs:element name="type"/>
      <xs:any namespace="##other"/>
    </xs:choice>
  </xs:sequence>
  <xs:attribute ref="prov:id"/>
</xs:complexType>

So, for example:

used(a1, e1, 2011-11-16T16:00:00, [ ex:parameter="p1" ])

becomes

<prov:used>
  <prov:activity prov:ref="a1"/>
  <prov:entity prov:ref="e1"/>
  <prov:time>2011-11-16T16:00:00</prov:time>
  <ex:parameter>p1</ex:parameter>
</prov:used>


In particular, the use of the "xs:choice" allows:

used(a1, e1, [prov:location="room a", prov:type="mytype"])

to be expressed with

<prov:used>
  <prov:activity prov:ref="a1"/>
  <prov:entity prov:ref="e1"/>
  <prov:location>room a</prov:location>
  <prov:type>mytype</prov:type>
</prov:used>

or

<prov:used>
  <prov:activity prov:ref="a1"/>
  <prov:entity prov:ref="e1"/>
  <prov:type>mytype</prov:type>
  <prov:location>room a</prov:location>
</prov:used>

They are both equally valid.

Concern

Ironically, the same "xs:choice" that lets the extra attributes appear in any order also signals to automated object model builders that the order found in the document is significant and that their model must preserve it.

This is a well-known, widely reported problem with JAXB, the Java Architecture for XML Binding.

Specifically, JAXB builds a single internal list of all the attributes, named something like LocationOrRoleOrLabel..., which holds them in the order they were parsed. Working with the attributes requires iterating through the list and checking, for each item, whether it is the attribute you are looking for. This makes the automatic bindings awkward to work with.

Analysis

There are a number of community-recommended options for coping with this issue (since it occurs so frequently):

  1. Develop a separate schema without the choice, and build the Java model from that secondary schema.
  2. Add annotations/binding recommendations that drive JAXB to build its bindings the way you want them built.
  3. Use the JAXB plugin Simplify, which takes a similar annotation and splits the choice options into individual lists/methods (locations, types, roles, etc.).
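
As a rough sketch of option 3, the Simplify plugin (part of the JAXB2 Basics project) is driven by an annotation placed on the choice itself. The namespace URI and the as-element-property element below are recalled from that plugin's documentation and should be checked against the version in use. Whether the xs:any wildcard inside the choice is handled the same way as the named elements has not been verified here.

```xml
<!-- Assumed declarations on the enclosing xs:schema element:
       xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
       xmlns:simplify="http://jaxb2-commons.dev.java.net/basicjaxb/ext/simplify"
       jaxb:version="2.1"
       jaxb:extensionBindingPrefixes="simplify" -->
<xs:choice minOccurs="0" maxOccurs="unbounded">
  <xs:annotation>
    <xs:appinfo>
      <!-- Ask Simplify for one property per element instead of one combined list -->
      <simplify:as-element-property/>
    </xs:appinfo>
  </xs:annotation>
  <xs:element name="location"/>
  <xs:element name="role"/>
  <xs:element name="label"/>
  <xs:element name="type"/>
  <xs:any namespace="##other"/>
</xs:choice>
```

The plugin also has to be enabled when the schema compiler runs. The schema still accepts the attribute elements in any order; only the generated Java model changes.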

Options

  1. Keep the choice in the official schema and:
    1. Add a FAQ recommending that JAXB users use one of the options listed in the Analysis.
    2. Actually create (and maintain) a separate schema that will force JAXB to do the right thing.
    3. Embed JAXB-specific annotations in the schema that make JAXB do the right thing.
    4. Create an external JAXB binding recommendations file that can be used with JAXB to make it do the right thing.
  2. Just get rid of the choice and force users to use the attributes in a specific order.

Recommendation

At this point, it would be easiest to just do away with the choice and describe the required order of attributes in the documentation for PROV-XML.
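
A minimal sketch of what the schema could look like without the choice, assuming the attribute elements are fixed in the order location, role, label, type, followed by elements from other namespaces. That order is only an illustration, taken from the order in which the choice lists them above; the actual order would be whatever the PROV-XML documentation prescribes.

```xml
<xs:complexType name="something">
  <xs:sequence>
    <xs:element name="a" type="..."/>
    <xs:element name="b" type="..."/>
    <xs:element name="c" type="..." minOccurs="0"/>
    <!-- Attribute elements: each optional and repeatable, but in a fixed
         order (the order shown is assumed for illustration) -->
    <xs:element name="location" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="role" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="label" minOccurs="0" maxOccurs="unbounded"/>
    <xs:element name="type" minOccurs="0" maxOccurs="unbounded"/>
    <xs:any namespace="##other" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
  <xs:attribute ref="prov:id"/>
</xs:complexType>
```

Under this sketch, the first of the two prov:used examples in the Description (prov:location before prov:type) would still validate, while the second (prov:type before prov:location) would be rejected. A binding tool such as JAXB would typically generate one list per element from a sequence like this, rather than a single combined list.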