Re: PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML Serialization]

On Mar 28, 2013, at 10:43 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:

> 
> Hi Stephan,
> 
> Thanks, you have answered my earlier question.
> This looks awkward indeed.
> 
> More below:
> 
> On 03/28/2013 04:06 PM, Stephan Zednik wrote:
>> The proposed solution leads to the possibility of multiple documentBundles in a given document.
>> 
>> <prov:document
>> 	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> 	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>>     xmlns:prov="http://www.w3.org/ns/prov#"
>>     xmlns:ex="http://example.com/ns/ex#">
>> 
>>   <prov:documentBundle>
>>     <!-- statements -->
>>   </prov:documentBundle>
>>   <prov:documentBundle>
>>     <!-- statements -->
>>   </prov:documentBundle>
>>   <prov:documentBundle>
>>     <!-- statements -->
>>   </prov:documentBundle>
>> 
>> </prov:document>
>> 
>> This is an interesting scenario we had not accounted for previously.
>> 
>> These document bundles are differentiated from regular bundles in that they do not support the prov:id attribute.  This element also does not have a corresponding concept in PROV-N.  I think this could cause confusion.  I do not know how to justify/explain an xml document with multiple documentBundles.
>> 
>> I think I prefer the option of enforcing an ordering of bundle constructors, non-bundle prov statements, and xsd:any (see following) over introducing a documentBundle element that is not clearly differentiated from the existing bundleConstructor and that does not correspond to a concept from the DM or any other serialization.
>> 
>>   <xs:complexType name="Document">
>>     <xs:sequence>
>>       <xs:group ref="prov:documentElements" minOccurs="0" maxOccurs="unbounded"/>
>>       <xs:element name="bundleContent" type="prov:BundleConstructor" minOccurs="0" maxOccurs="unbounded"/>
>>       <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
>>     </xs:sequence>
>>   </xs:complexType>
> 
> We could probably even have documentElements and bundleContent elements in any order.

If I understand the cause of the issue correctly, our problem occurs because of the unbounded sequence within the complexType Document.  We also have an unbounded sequence in the BundleConstructor complex type that is likely causing a similar issue.

What if we:

1) include the bundleContent element in the documentElements group
2) create a bundleElements group that is similar to documentElements except that it does not have the bundleContent element (no bundleContent nesting allowed)
3) update the Document complex type to the following:

  <xs:complexType name="Document">
    <xs:sequence>
        <xs:group ref="prov:documentElements" minOccurs="0" maxOccurs="unbounded"/>
        <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

4) update the BundleConstructor complex type to the following (removing its unbounded sequence):

  <xs:complexType name="BundleConstructor">
    <xs:sequence>
        <xs:group ref="prov:bundleElements" maxOccurs="unbounded"/>
        <xs:any namespace="##other" processContents="lax" minOccurs="0"/>
    </xs:sequence>
    <xs:attribute ref="prov:id"/>
  </xs:complexType>

All current examples validate with this change, but I am not sure whether it results in a more useful JAXB-generated class.  I have attached the generated Document and BundleConstructor Java classes.
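
For anyone who wants to try the ordering rules without the full schema, here is a minimal sketch using only the JDK's javax.xml.validation API. It is not the PROV schema: the embedded XSD is a trimmed-down stand-in that keeps the shape of steps 1-4 above, and the Statement type, the two-element bundleElements group, the ex:note element, and the ProvSchemaCheck, doc, and isValid names are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

class ProvSchemaCheck {
    static final String PROV = "http://www.w3.org/ns/prov#";

    // Stand-in schema (assumption): only entity and activity statements are
    // modelled, and both share one hypothetical "Statement" type.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'"
      + " xmlns:prov='" + PROV + "' targetNamespace='" + PROV + "'"
      + " elementFormDefault='qualified'>"
      + "<xs:attribute name='id' type='xs:QName'/>"
      + "<xs:complexType name='Statement'><xs:attribute ref='prov:id'/></xs:complexType>"
      // Step 2: bundleElements has no bundleContent, so bundles cannot nest.
      + "<xs:group name='bundleElements'><xs:choice>"
      + "<xs:element name='entity' type='prov:Statement'/>"
      + "<xs:element name='activity' type='prov:Statement'/>"
      + "</xs:choice></xs:group>"
      // Step 1: documentElements additionally allows bundleContent.
      + "<xs:group name='documentElements'><xs:choice>"
      + "<xs:group ref='prov:bundleElements'/>"
      + "<xs:element name='bundleContent' type='prov:BundleConstructor'/>"
      + "</xs:choice></xs:group>"
      // Step 4: BundleConstructor as proposed in the message.
      + "<xs:complexType name='BundleConstructor'><xs:sequence>"
      + "<xs:group ref='prov:bundleElements' maxOccurs='unbounded'/>"
      + "<xs:any namespace='##other' processContents='lax' minOccurs='0'/>"
      + "</xs:sequence><xs:attribute ref='prov:id'/></xs:complexType>"
      // Step 3: Document as proposed in the message.
      + "<xs:complexType name='Document'><xs:sequence>"
      + "<xs:group ref='prov:documentElements' minOccurs='0' maxOccurs='unbounded'/>"
      + "<xs:any namespace='##other' processContents='lax' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType>"
      + "<xs:element name='document' type='prov:Document'/>"
      + "</xs:schema>";

    // Compile the stand-in schema; a failure here is a bug in the sketch itself.
    static Schema compile() {
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            return factory.newSchema(new StreamSource(new StringReader(XSD)));
        } catch (SAXException e) {
            throw new IllegalStateException("stand-in schema does not compile", e);
        }
    }

    // Wrap a body in a prov:document root with the prov and ex prefixes declared.
    static String doc(String body) {
        return "<prov:document xmlns:prov='" + PROV + "'"
             + " xmlns:ex='http://example.com/ns/ex#'>" + body + "</prov:document>";
    }

    // True if the instance validates; the default error handling throws on the first error.
    static boolean isValid(String xml) {
        try {
            compile().newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prov statements and a bundle first, extension element last
        System.out.println(isValid(doc("<prov:entity prov:id='ex:e1'/>"
            + "<prov:bundleContent prov:id='ex:b1'><prov:activity prov:id='ex:a1'/></prov:bundleContent>"
            + "<ex:note/>")));
        // extension element before a prov statement
        System.out.println(isValid(doc("<ex:note/><prov:entity prov:id='ex:e1'/>")));
        // bundleContent nested inside bundleContent
        System.out.println(isValid(doc("<prov:bundleContent prov:id='ex:b1'>"
            + "<prov:bundleContent prov:id='ex:b2'><prov:entity prov:id='ex:e1'/></prov:bundleContent>"
            + "</prov:bundleContent>")));
    }
}
```

Under these assumptions only the first instance should be accepted. The second illustrates the cost of the proposal: extension elements can no longer be interleaved with prov statements. The third shows that bundleContent nesting is rejected.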

Luc, what do you think of these classes?  Are they still too awkward?

--Stephan
> 
> The other option is to introduce another prov element:
> 
> <prov:extensibility>
>   ... here non prov statements
> </prov:extensibility>
> 
> allowed anywhere.
> 
> Luc
> 
> 
>> 
>> Also, I like the idea of leaving the schema as it is if a jaxb binding file can provide a solution and detailing that jaxb-specific solution in a FAQ entry, but I agree that making a change to the schema to prevent this issue from occurring would probably be the better and more visible solution.
>> 
>> --Stephan
>> 
>> On Mar 28, 2013, at 9:04 AM, Stephan Zednik <zednis@rpi.edu> wrote:
>> 
>>> Hi Hook,
>>> 
>>> Thanks for looking into this.  I would like to test out the proposed solution and I will provide feedback by the EOD.
>>> 
>>> Luc, would the proposed solution resolve this issue?
>>> 
>>> --Stephan
>>> 
>>> On Mar 28, 2013, at 4:40 AM, "Hua, Hook (388C)" <hook.hua@jpl.nasa.gov> wrote:
>>> 
>>>> Hi Luc and Stephan,
>>>> 
>>>> Somehow, with jaxb-ri-2.2.6, the removal of xsd:any still generates
>>>> JAXBElements.
>>>> 
>>>> Hopefully we may not need to modify the xsd:any support nor use customized
>>>> bindings mapping for JAXB. In looking into it further, I believe I have
>>>> found a more upstream cause and a potentially cleaner solution.
>>>> 
>>>> Given that we have the following unfriendly XML binding mapping:
>>>> 
>>>> ------------------------------------------------------
>>>> <xs:element name="document" type="prov:Document" />
>>>> 
>>>> <xs:complexType name="Document">
>>>> <xs:sequence maxOccurs="unbounded">
>>>>   <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>   <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>> minOccurs="0"/>
>>>>   <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>> </xs:sequence>
>>>> </xs:complexType>
>>>> 
>>>> ---------
>>>> 
>>>> 
>>>> public class Document {
>>>> @XmlElementRefs({
>>>> @XmlElementRef(name = "wasRevisionOf", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> @XmlElementRef(name = "activity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> @XmlElementRef(name = "collection", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> @XmlElementRef(name = "bundle", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> @XmlElementRef(name = "wasQuotedFrom", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> @XmlElementRef(name = "wasInvalidatedBy", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>>> ...
>>>> })
>>>> @XmlAnyElement(lax = true)
>>>> protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>> ...
>>>> ------------------------------------------------------
>>>> 
>>>> It appears that to help retain a round-trip marshalling/unmarshalling of
>>>> our prov:Document, the unbounded sequence of its elements (including
>>>> prov:BundleConstructor) must be uniquely distinguished by JAXB. The
>>>> repeating sequences are treated as a List<Object> of generic JAXBElements,
>>>> where the JAXBElement's QName is used to distinguish elements with
>>>> different names. So the culprit may be the unbounded sequence.
>>>> 
>>>> <xs:sequence maxOccurs="unbounded">
>>>> <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>> <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>> minOccurs="0"/>
>>>> <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>> </xs:sequence>
>>>> 
>>>> 
>>>> 
>>>> 
>>>> What if we move the unbounded occurrence into a wrapper complex type and
>>>> keep the sequence singular? Below, I've introduced a "prov:DocumentBundle"
>>>> wrapper complex type to which the unbounded occurrence is applied. Then
>>>> "prov:DocumentBundle" maintains the same subelements as before, but as
>>>> one occurrence of the sequence. Running it through JAXB now generates the
>>>> cleaner prov-typed List elements. No customized bindings for JAXB needed.
>>>> No removal of xsd:any needed.
>>>> 
>>>> 
>>>> ------------------------------------------------------
>>>> <xs:element name="document" type="prov:Document" />
>>>> 
>>>> <xs:complexType name="DocumentBundle">
>>>> <xs:sequence>
>>>>   <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>   <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>> minOccurs="0"/>
>>>>   <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>> </xs:sequence>
>>>> </xs:complexType>
>>>> 
>>>> <xs:complexType name="Document">
>>>> <xs:sequence>
>>>>   <xs:element name="documentBundle" type="prov:DocumentBundle"
>>>> minOccurs="0" maxOccurs="unbounded"/>
>>>> </xs:sequence>
>>>> </xs:complexType>
>>>> 
>>>> ---------
>>>> 
>>>> public class Document {
>>>> protected DocumentBundle documentBundle;
>>>> ...
>>>> 
>>>> public class DocumentBundle {
>>>> protected List<Entity> entity;
>>>> protected List<Activity> activity;
>>>> protected List<Generation> wasGeneratedBy;
>>>> protected List<Usage> used;
>>>> protected List<Communication> wasInformedBy;
>>>> protected List<Start> wasStartedBy;
>>>> protected List<End> wasEndedBy;
>>>> protected List<Invalidation> wasInvalidatedBy;
>>>> ...
>>>> ------------------------------------------------------
>>>> 
>>>> We could also rename the wrapper "prov:DocumentBundle" to something else
>>>> to reduce possible confusion with prov:Bundle and prov:BundleConstructor.
>>>> 
>>>> 
>>>> 
>>>> I think we need to understand that this approach introduces another
>>>> indirection artifact in the PROV-XML encoding. Would this be an acceptable
>>>> compromise approach around the JAXBElement issue?
>>>> 
>>>> --Hook
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On 3/21/13 8:49 AM, "Stephan Zednik" <zednis@rpi.edu> wrote:
>>>> 
>>>>> 
>>>>> On Mar 21, 2013, at 5:26 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>>>> 
>>>>>> Hi Hook,
>>>>>> 
>>>>>> Thanks for this analysis.
>>>>>> 
>>>>>> In this specific instance, I think that it is the element
>>>>>> <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>>> occurring inside
>>>>>> <xs:sequence maxOccurs="unbounded">
>>>>>> that causes these jaxb elements to be generated.
>>>>>> 
>>>>>> If you were to remove xsd:any there, jaxbElements would no longer be
>>>>>> generated.
>>>>>> 
>>>>>> While we want to allow the possibility of elements from other schemas,
>>>>>> do we really want to allow them anywhere inside a document/bundle?
>>>>> We want to provide for elements from other schemas but I don't think we
>>>>> formally identified what areas we intend to allow non-prov elements in
>>>>> before we added this functionality to the schema.
>>>>> 
>>>>> What if we made a FAQ entry about OXM mappings with PROV-XML and created
>>>>> a customized schema or bindings file specifically for JAXB code
>>>>> generation?  This would allow us to work on it asynchronously from the
>>>>> document and past the note publication. It would also allow us to
>>>>> introduce JAXB-specific solutions that I do not think make sense in the
>>>>> official schema or note.
>>>>> 
>>>>> --Stephan
>>>>> 
>>>>>> Luc
>>>>>> 
>>>>>> 
>>>>>> On 03/21/2013 11:09 AM, Hua, Hook (388C) wrote:
>>>>>>> Hi Luc,
>>>>>>> 
>>>>>>> I'm using jaxb-ri-2.2.6 against our latest prov*.xsd and seeing
>>>>>>> slightly
>>>>>>> different bindings with JAXBElement:
>>>>>>> 
>>>>>>> public class Document {
>>>>>>>   @XmlElementRefs({
>>>>>>>       @XmlElementRef(name = "hadPrimarySource", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>>> false),
>>>>>>>       @XmlElementRef(name = "agent", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>>> false),
>>>>>>>       @XmlElementRef(name = "activity", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>>> false),
>>>>>>>       @XmlElementRef(name = "organization", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>>> false),
>>>>>>>       @XmlElementRef(name = "softwareAgent", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>>> false),
>>>>>>> ....
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> Some findings:
>>>>>>> 
>>>>>>> 
>>>>>>> (1) JAXB's generation of JAXBElement<T> classes seems to be a wrapper
>>>>>>> approach to preserve sufficient information in the schema for
>>>>>>> round-trip
>>>>>>> marshaling & unmarshalling of values in XML instances. More
>>>>>>> specifically,
>>>>>>> it wraps the data with a QName and a nillable flag [1].
>>>>>>> 
>>>>>>> It appears that a frequent cause of JAXB producing JAXBElement<T> is
>>>>>>> its attempt to preserve elements with both minOccurs=0 and
>>>>>>> nillable=true.
>>>>>>> JAXB needs to distinguish between the two cases where:
>>>>>>> 
>>>>>>> a. element missing, minOccurs=0, then jaxbElement==null
>>>>>>> b. element present, xsi:nil=true, then jaxbElement.isNil()==true
>>>>>>> 
>>>>>>> It would not be possible to distinguish between these two states if the
>>>>>>> bindings were the raw types.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> (2) It would be possible to customize the JAXB bindings [2] to ignore
>>>>>>> the
>>>>>>> full round-trip requirement. The "generateElementProperty=false"
>>>>>>> customization option "can be used to generate an alternate developer
>>>>>>> friendly but lossy binding" [3].
>>>>>>> 
>>>>>>> I tried variations of a "bindings.xjb" customization file:
>>>>>>> 
>>>>>>> <jaxb:bindings version="2.1"
>>>>>>> xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
>>>>>>> xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
>>>>>>> xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>>>>> <jaxb:bindings schemaLocation="prov-core.xsd"
>>>>>>>   node="//xs:complexType[@name='Document']">
>>>>>>>   <jaxb:globalBindings generateElementProperty="false" />
>>>>>>> </jaxb:bindings>
>>>>>>> </jaxb:bindings>
>>>>>>> 
>>>>>>> $ xjc.sh -d BINDINGS -b bindings.xjb prov.xsd
>>>>>>> 
>>>>>>> But none truly eliminated the JAXBElement<T> from the bindings.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> (3) Nowhere in our prov-core.xsd do we define minOccurs=0 in
>>>>>>> conjunction with nillable=true. In my attempts with JAXB, I'm seeing
>>>>>>> JAXBElements appearing in the bindings for the (a) Document class and
>>>>>>> (b) BundleConstructor class. Both types leverage the
>>>>>>> prov:documentElements grouping.
>>>>>>> 
>>>>>>> <xs:element name="document" type="prov:Document" />
>>>>>>> <xs:complexType name="Document">
>>>>>>> <xs:sequence maxOccurs="unbounded">
>>>>>>>   <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>>>>   <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>>>>> minOccurs="0"/>
>>>>>>>   <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>>>> </xs:sequence>
>>>>>>> </xs:complexType>
>>>>>>> 
>>>>>>> It's unclear whether there is some nillable-like effect that triggers
>>>>>>> JAXB to generate the JAXBElements.
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> (4) On the upside, JAXB does provide an ObjectFactory class as part of
>>>>>>> the generated bindings that defines factory methods to create the
>>>>>>> JAXBElement instances. For example:
>>>>>>> 
>>>>>>> public JAXBElement<Usage> createUsed(Usage value)
>>>>>>> 
>>>>>>> Still, I agree that it is not as clean.
>>>>>>> 
>>>>>>> 
>>>>>>> --Hook
>>>>>>> 
>>>>>>> 
>>>>>>> [1] http://docs.oracle.com/javaee/5/api/javax/xml/bind/JAXBElement.html
>>>>>>> [2] http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/tutorial/doc/JAXBUsing4.html#wp148515
>>>>>>> [3] http://docs.oracle.com/cd/E17802_01/webservices/webservices/reference/tutorials/wsit/doc/DataBinding5.html
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On 3/8/13 4:20 AM, "Provenance Working Group Issue Tracker"
>>>>>>> <sysbot+tracker@w3.org> wrote:
>>>>>>> 
>>>>>>>> PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML
>>>>>>>> Serialization]
>>>>>>>> 
>>>>>>>> http://www.w3.org/2011/prov/track/issues/648
>>>>>>>> 
>>>>>>>> Raised by: Luc Moreau
>>>>>>>> On product: XML Serialization
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Hi
>>>>>>>> 
>>>>>>>> I have ported the ProvToolbox and the ProvValidator to the new XML
>>>>>>>> schema.
>>>>>>>> I just wanted to report on my experience with the schema and JAXB.
>>>>>>>> Obviously, others may have better experience with JAXB and may be able
>>>>>>>> to help on some of the issues I encountered.
>>>>>>>> 
>>>>>>>> Everything worked fine, except:
>>>>>>>> - <xs:element ref="prov:internalElement" abstract="true"/>
>>>>>>>> - extensibility <xs:any namespace="##other"/> in Document and Bundle
>>>>>>>> 
>>>>>>>> 
>>>>>>>> These two constructs, while processable by JAXB, are not
>>>>>>>> JAXB-friendly.
>>>>>>>> 
>>>>>>>> Indeed, JAXB compiles the schema in a list containing all possible
>>>>>>>> statements.
>>>>>>>> 
>>>>>>>>  protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>>>> 
>>>>>>>> However, the presence of an abstract element and an <any/> element
>>>>>>>> results in the content of that list being of type:
>>>>>>>> 
>>>>>>>> 
>>>>>>>>  @XmlElementRefs({
>>>>>>>>      @XmlElementRef(name = "used", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>>      @XmlElementRef(name = "wasAssociatedWith", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>>      @XmlElementRef(name = "person", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>>      @XmlElementRef(name = "entity", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>>      @XmlElementRef(name = "wasInfluencedBy", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#"
>>>>>>>> ....
>>>>>>>>  })
>>>>>>>> 
>>>>>>>>  @XmlAnyElement(lax = true)
>>>>>>>>  protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>>>> 
>>>>>>>> where all data structures are wrapped up in this unpleasant
>>>>>>>> JAXBElement.
>>>>>>>> 
>>>>>>>> Without these features, we get a much more natural mapping:
>>>>>>>>  @XmlElements({
>>>>>>>>      @XmlElement(name = "entity", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = Entity.class),
>>>>>>>>      @XmlElement(name = "activity", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = Activity.class),
>>>>>>>>      @XmlElement(name = "wasGeneratedBy", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = WasGeneratedBy.class),
>>>>>>>>      @XmlElement(name = "used", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = Used.class),
>>>>>>>>      @XmlElement(name = "wasInformedBy", namespace =
>>>>>>>> "http://www.w3.org/ns/prov#", type = WasInformedBy.class),
>>>>>>>>  ...
>>>>>>>> })
>>>>>>>> 
>>>>>>>> So, how did I solve the problem?  I inserted the extension schemas
>>>>>>>> into the schema file, and hence got rid of the abstract element.  I am
>>>>>>>> ok with this. We could possibly provide a utility to do that
>>>>>>>> transformation.
>>>>>>>> 
>>>>>>>> For the extensibility, I used a different definition. It happens to
>>>>>>>> parse prov-xml compliant xml. When serializing, it puts all
>>>>>>>> extensibility elements at the end.  This is not a satisfactory
>>>>>>>> solution, and is likely to be dependent on the JAXB implementation
>>>>>>>> (though I am not entirely sure).
>>>>>>>> 
>>>>>>>> 
>>>>>>>> <xs:complexType name="Document">
>>>>>>>>   <xs:sequence>
>>>>>>>>     <xs:choice maxOccurs="unbounded">
>>>>>>>>       <xs:group ref="prov:documentElements"/>
>>>>>>>>       <xs:element name="bundleContent" type="prov:NamedBundle"/>
>>>>>>>>     </xs:choice>
>>>>>>>>     <xs:any namespace="##other" processContents="lax" minOccurs="0"
>>>>>>>> maxOccurs="unbounded"/>
>>>>>>>>   </xs:sequence>
>>>>>>>> </xs:complexType>
>>>>>>>> 
>>>>>>>> Can something be done to make the XML schema a bit more jaxb friendly,
>>>>>>>> while still keeping the same flexibility?  Thoughts welcome.
>>>>>>>> 
>>>>>>>> Cheers,
>>>>>>>> Luc
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> -- 
>>>>>> Professor Luc Moreau
>>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>>> 
>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
> 
> -- 
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
> 
> 
> 

Received on Thursday, 28 March 2013 17:32:11 UTC