Re: PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML Serialization]

Hi Stephan,

Thanks, you have answered my earlier question.
This does look awkward indeed.

More below:

On 03/28/2013 04:06 PM, Stephan Zednik wrote:
> The proposed solution leads to the possibility of multiple documentBundles in a given document.
>
> <prov:document
> 	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> 	xmlns:xsd="http://www.w3.org/2001/XMLSchema"
>      xmlns:prov="http://www.w3.org/ns/prov#"
>      xmlns:ex="http://example.com/ns/ex#">
>
>    <prov:documentBundle>
>      <!-- statements -->
>    </prov:documentBundle>
>    <prov:documentBundle>
>      <!-- statements -->
>    </prov:documentBundle>
>    <prov:documentBundle>
>      <!-- statements -->
>    </prov:documentBundle>
>
> </prov:document>
>
> This is an interesting scenario we had not accounted for previously.
>
> These document bundles are differentiated from regular bundles in that they do not support the prov:id attribute.  This element also does not have a corresponding concept in PROV-N.  I think this could cause confusion.  I do not know how to justify/explain an xml document with multiple documentBundles.
>
> I think I prefer the option of enforcing the ordering of bundle constructors, non-bundle prov-statements, and xsd:any (see following) over introducing a documentBundle element that is not clearly differentiated from the existing bundleConstructor and which does not correspond to a concept from the DM or any other serialization.
>
>    <xs:complexType name="Document">
>      <xs:sequence>
>          <xs:group ref="prov:documentElements" minOccurs="0" maxOccurs="unbounded"/>
> 		<xs:element name="bundleContent" type="prov:BundleConstructor" minOccurs="0" maxOccurs="unbounded"/>
>          <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
>      </xs:sequence>
>    </xs:complexType>

We could probably even allow documentElements and bundleContent elements in any order.
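
As a rough sketch (untested with xjc, and assuming we keep the existing prov:documentElements group and prov:BundleConstructor type), the any-order variant could look like this, with xsd:any still confined to the end:

<xs:complexType name="Document">
  <xs:sequence>
    <!-- PROV statements and bundles, interleaved in any order -->
    <xs:choice minOccurs="0" maxOccurs="unbounded">
      <xs:group ref="prov:documentElements"/>
      <xs:element name="bundleContent" type="prov:BundleConstructor"/>
    </xs:choice>
    <!-- non-PROV elements only after all PROV content -->
    <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

This is close to the definition I used in the original issue report. Whether JAXB then generates typed lists rather than JAXBElements would still need to be checked.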

The other option is to introduce another prov element:

<prov:extensibility>
    ... non-PROV statements go here
</prov:extensibility>

which would be allowed anywhere.
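
A possible shape for such a wrapper, only as a sketch: the type name prov:Extensibility is a placeholder I am making up here, not something in the current schema, and I have not run this through xjc.

<!-- hypothetical wrapper element holding all non-PROV content -->
<xs:element name="extensibility" type="prov:Extensibility"/>

<xs:complexType name="Extensibility">
  <xs:sequence>
    <xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

<xs:complexType name="Document">
  <!-- PROV statements, bundles and wrappers, in any order -->
  <xs:choice minOccurs="0" maxOccurs="unbounded">
    <xs:group ref="prov:documentElements"/>
    <xs:element name="bundleContent" type="prov:BundleConstructor"/>
    <xs:element ref="prov:extensibility"/>
  </xs:choice>
</xs:complexType>

Since the wrapper is an ordinary named prov element, the xsd:any would no longer sit directly inside the repeating content of Document.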

Luc


>
> Also, I like the idea of leaving the schema as it is if a jaxb binding file can provide a solution and detailing that jaxb-specific solution in a FAQ entry, but I agree that making a change to the schema to prevent this issue from occurring would probably be the better and more visible solution.
>
> --Stephan
>
> On Mar 28, 2013, at 9:04 AM, Stephan Zednik <zednis@rpi.edu> wrote:
>
>> Hi Hook,
>>
>> Thanks for looking into this.  I would like to test out the proposed solution and I will provide feedback by the EOD.
>>
>> Luc, would the proposed solution resolve this issue?
>>
>> --Stephan
>>
>> On Mar 28, 2013, at 4:40 AM, "Hua, Hook (388C)" <hook.hua@jpl.nasa.gov> wrote:
>>
>>> Hi Luc and Stephan,
>>>
>>> Somehow, with jaxb-ri-2.2.6, the removal of xsd:any still generates
>>> JAXBElements.
>>>
>>> Hopefully we may not need to modify the xsd:any support nor use customized
>>> bindings mapping for JAXB. In looking into it further, I believe I have
>>> found a more upstream cause and a potentially cleaner solution.
>>>
>>> Given that we have the following unfriendly XML binding mapping:
>>>
>>> ------------------------------------------------------
>>> <xs:element name="document" type="prov:Document" />
>>>
>>> <xs:complexType name="Document">
>>> <xs:sequence maxOccurs="unbounded">
>>>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>    <xs:element name="bundleContent" type="prov:BundleConstructor"
>>> minOccurs="0"/>
>>>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> ---------
>>>
>>>
>>> public class Document {
>>> @XmlElementRefs({
>>> @XmlElementRef(name = "wasRevisionOf", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> @XmlElementRef(name = "activity", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> @XmlElementRef(name = "collection", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> @XmlElementRef(name = "bundle", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> @XmlElementRef(name = "wasQuotedFrom", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> @XmlElementRef(name = "wasInvalidatedBy", namespace =
>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>>> ...
>>> })
>>> @XmlAnyElement(lax = true)
>>> protected List<Object> entityAndActivityAndWasGeneratedBy;
>>> ...
>>> ------------------------------------------------------
>>>
>>> It appears that to help retain a round-trip marshalling/unmarshalling of
>>> our prov:Document, the unbounded sequence of its elements (including
>>> prov:BundleConstructor) must be uniquely distinguished by JAXB. The
>>> repeating sequences are treated as a List<Object> of generic JAXBElements,
>>> where the JAXBElement's QName is used to distinguish elements with
>>> different names. So the culprit may be the unbounded sequence.
>>>
>>> <xs:sequence maxOccurs="unbounded">
>>> <xs:group ref="prov:documentElements" minOccurs="0"/>
>>> <xs:element name="bundleContent" type="prov:BundleConstructor"
>>> minOccurs="0"/>
>>> <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>> </xs:sequence>
>>>
>>>
>>>
>>>
>>> What if we move the unbounded occurrence into a wrapper complex type and
>>> keep the sequence singular? Below, I've introduced a "prov:DocumentBundle"
>>> wrapper complex type in which to apply the unbounded occurrence to. Then
>>> in "prov:DocumentBundle", maintain the same subelements as before, but as
>>> one occurrence of the sequence. Running it through JAXB now generates the
>>> cleaner prov-typed List elements. No customized bindings for JAXB needed.
>>> No removal of xsd:any needed.
>>>
>>>
>>> ------------------------------------------------------
>>> <xs:element name="document" type="prov:Document" />
>>>
>>> <xs:complexType name="DocumentBundle">
>>> <xs:sequence>
>>>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>    <xs:element name="bundleContent" type="prov:BundleConstructor"
>>> minOccurs="0"/>
>>>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> <xs:complexType name="Document">
>>> <xs:sequence>
>>>    <xs:element name="documentBundle" type="prov:DocumentBundle"
>>> minOccurs="0" maxOccurs="unbounded"/>
>>> </xs:sequence>
>>> </xs:complexType>
>>>
>>> ---------
>>>
>>> public class Document {
>>> protected DocumentBundle documentBundle;
>>> ...
>>>
>>> public class DocumentBundle {
>>> protected List<Entity> entity;
>>> protected List<Activity> activity;
>>> protected List<Generation> wasGeneratedBy;
>>> protected List<Usage> used;
>>> protected List<Communication> wasInformedBy;
>>> protected List<Start> wasStartedBy;
>>> protected List<End> wasEndedBy;
>>> protected List<Invalidation> wasInvalidatedBy;
>>> ...
>>> ------------------------------------------------------
>>>
>>> We could also rename the wrapper "prov:DocumentBundle" to something else
>>> to reduce possible confusion with prov:Bundle and prov:BundleConstructor.
>>>
>>>
>>>
>>> I think we need to understand that this approach introduces another
>>> indirection artifact in the PROV-XML encoding. Would this be an acceptable
>>> compromise approach around the JAXBElement issue?
>>>
>>> --Hook
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On 3/21/13 8:49 AM, "Stephan Zednik" <zednis@rpi.edu> wrote:
>>>
>>>>
>>>> On Mar 21, 2013, at 5:26 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>>>
>>>>> Hi Hook,
>>>>>
>>>>> Thanks for this analysis.
>>>>>
>>>>> In this specific instance, I think that it is the element
>>>>> <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>> occurring inside
>>>>> <xs:sequence maxOccurs="unbounded">
>>>>> that causes these jaxb elements to be generated.
>>>>>
>>>>> If you were to remove xsd:any there, jaxbElements would no longer be
>>>>> generated.
>>>>>
>>>>> While we want to allow the possibility of elements from other schemas,
>>>>> do we really want to allow them anywhere inside a document/bundle?
>>>> We want to provide for elements from other schemas but I don't think we
>>>> formally identified what areas we intend to allow non-prov elements in
>>>> before we added this functionality to the schema.
>>>>
>>>> What if we made a FAQ entry about OXM mappings with PROV-XML and created
>>>> a customized schema or bindings file specifically for JAXB code
>>>> generation?  This would allow us to work on it asynchronously from the
>>>> document, even past the note publication. It would also allow us to
>>>> introduce JAXB-specific solutions that I do not think make sense in the
>>>> official schema or note.
>>>>
>>>> --Stephan
>>>>
>>>>> Luc
>>>>>
>>>>>
>>>>> On 03/21/2013 11:09 AM, Hua, Hook (388C) wrote:
>>>>>> Hi Luc,
>>>>>>
>>>>>> I'm using jaxb-ri-2.2.6 against our latest prov*.xsd and seeing
>>>>>> slightly
>>>>>> different bindings with JAXBElement:
>>>>>>
>>>>>> public class Document {
>>>>>>    @XmlElementRefs({
>>>>>>        @XmlElementRef(name = "hadPrimarySource", namespace =
>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>> false),
>>>>>>        @XmlElementRef(name = "agent", namespace =
>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>> false),
>>>>>>        @XmlElementRef(name = "activity", namespace =
>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>> false),
>>>>>>        @XmlElementRef(name = "organization", namespace =
>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>> false),
>>>>>>        @XmlElementRef(name = "softwareAgent", namespace =
>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>>>> false),
>>>>>> ....
>>>>>>
>>>>>>
>>>>>>
>>>>>> Some findings:
>>>>>>
>>>>>>
>>>>>> (1) JAXB's generation of JAXBElement<T> classes seems to be a wrapper
>>>>>> approach to preserve sufficient information in the schema for
>>>>>> round-trip
>>>>>> marshaling & unmarshalling of values in XML instances. More
>>>>>> specifically,
>>>>>> it wraps the data with a QName and a nillable flag [1].
>>>>>>
>>>>>> It appears that a frequent cause of JAXB producing JAXBElement<T> is
>>>>>> its attempt to preserve elements with both minOccurs=0 and
>>>>>> nillable=true.
>>>>>> JAXB needs to distinguish between the two cases where:
>>>>>>
>>>>>> a. element missing, minOccurs=0, then jaxbElement==null
>>>>>> b. element present, xsi:nil=true, then jaxbElement.isNil()==true
>>>>>>
>>>>>> It would not be possible to distinguish between these two states if the
>>>>>> bindings were the raw types.
>>>>>>
>>>>>>
>>>>>>
>>>>>> (2) It would be possible to customize the JAXB bindings [2] to ignore
>>>>>> the
>>>>>> full round-trip requirement. The "generateElementProperty=false"
>>>>>> customization option "can be used to generate an alternate developer
>>>>>> friendly but lossy binding" [3].
>>>>>>
>>>>>> I tried variations of a "bindings.xjb" customization file:
>>>>>>
>>>>>> <jaxb:bindings version="2.1"
>>>>>> xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
>>>>>> xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
>>>>>> xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>>>> <jaxb:bindings schemaLocation="prov-core.xsd"
>>>>>>    node="//xs:complexType[@name='Document']">
>>>>>>    <jaxb:globalBindings generateElementProperty="false" />
>>>>>> </jaxb:bindings>
>>>>>> </jaxb:bindings>
>>>>>>
>>>>>> $ xjc.sh -d BINDINGS -b bindings.xjb prov.xsd
>>>>>>
>>>>>> But none truly eliminated the JAXBElement<T> from the bindings.
>>>>>>
>>>>>>
>>>>>>
>>>>>> (3) Nowhere in our prov-core.xsd do we define minOccurs=0 in
>>>>>> conjunction
>>>>>> with nillable=true. In my attempts with JAXB, I'm seeing JAXBElements
>>>>>> appearing in the bindings for the (a) Document class and (b)
>>>>>> BundleConstructor class. Both types leverage the prov:documentElements
>>>>>> grouping.
>>>>>>
>>>>>> <xs:element name="document" type="prov:Document" />
>>>>>> <xs:complexType name="Document">
>>>>>> <xs:sequence maxOccurs="unbounded">
>>>>>>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>>>    <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>>>> minOccurs="0"/>
>>>>>>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>>> </xs:sequence>
>>>>>> </xs:complexType>
>>>>>>
>>>>>> It's unclear if there is some nillable-like effect that triggers JAXB
>>>>>> to generate the JAXBElements.
>>>>>>
>>>>>>
>>>>>>
>>>>>> (4) On the upside, JAXB does provide an ObjectFactory class as part of
>>>>>> the
>>>>>> generated bindings that define creational factory methods to generate
>>>>>> the
>>>>>> JAXBElement instances. For example:
>>>>>>
>>>>>> public JAXBElement<Usage> createUsed(Usage value)
>>>>>>
>>>>>> Still, I agree that it is not as clean.
>>>>>>
>>>>>>
>>>>>> --Hook
>>>>>>
>>>>>>
>>>>>> [1] http://docs.oracle.com/javaee/5/api/javax/xml/bind/JAXBElement.html
>>>>>> [2] http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/tutorial/doc/JAXBUsing4.html#wp148515
>>>>>> [3] http://docs.oracle.com/cd/E17802_01/webservices/webservices/reference/tutorials/wsit/doc/DataBinding5.html
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 3/8/13 4:20 AM, "Provenance Working Group Issue Tracker"
>>>>>> <sysbot+tracker@w3.org> wrote:
>>>>>>
>>>>>>> PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML
>>>>>>> Serialization]
>>>>>>>
>>>>>>> http://www.w3.org/2011/prov/track/issues/648
>>>>>>>
>>>>>>> Raised by: Luc Moreau
>>>>>>> On product: XML Serialization
>>>>>>>
>>>>>>>
>>>>>>> Hi
>>>>>>>
>>>>>>> I have ported the ProvToolbox and the ProvValidator to the new XML
>>>>>>> schema.
>>>>>>> I just wanted to report on my experience with the schema and JAXB.
>>>>>>> Obviously, others may have better experience with JAXB and may be able
>>>>>>> to help on some of the issues I encountered.
>>>>>>>
>>>>>>> Everything worked fine, except:
>>>>>>> - <xs:element ref="prov:internalElement" abstract="true"/>
>>>>>>> - extensibility <xs:any namespace="##other"/> in Document and Bundle
>>>>>>>
>>>>>>>
>>>>>>> These two constructs, while processable by JAXB, are not
>>>>>>> JAXB-friendly.
>>>>>>>
>>>>>>> Indeed, JAXB compiles the schema in a list containing all possible
>>>>>>> statements.
>>>>>>>
>>>>>>>   protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>>>
>>>>>>> However, the presence of an abstract element and an <any/> element
>>>>>>> results in the content of that list being of type:
>>>>>>>
>>>>>>>
>>>>>>>   @XmlElementRefs({
>>>>>>>       @XmlElementRef(name = "used", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>       @XmlElementRef(name = "wasAssociatedWith", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>       @XmlElementRef(name = "person", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>       @XmlElementRef(name = "entity", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>>>       @XmlElementRef(name = "wasInfluencedBy", namespace =
>>>>>>> "http://www.w3.org/ns/prov#"
>>>>>>> ....
>>>>>>>   })
>>>>>>>
>>>>>>>   @XmlAnyElement(lax = true)
>>>>>>>   protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>>>
>>>>>>> where all data structures are wrapped up in this unpleasant
>>>>>>> JAXBElement.
>>>>>>>
>>>>>>> Without these features, we get a much more natural mapping:
>>>>>>>   @XmlElements({
>>>>>>>       @XmlElement(name = "entity", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = Entity.class),
>>>>>>>       @XmlElement(name = "activity", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = Activity.class),
>>>>>>>       @XmlElement(name = "wasGeneratedBy", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = WasGeneratedBy.class),
>>>>>>>       @XmlElement(name = "used", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = Used.class),
>>>>>>>       @XmlElement(name = "wasInformedBy", namespace =
>>>>>>> "http://www.w3.org/ns/prov#", type = WasInformedBy.class),
>>>>>>>   ...
>>>>>>> })
>>>>>>>
>>>>>>> So, how did I solve the problem?  I inserted the extension schemas
>>>>>>> into the schema file, and hence got rid of the abstract element.  I am
>>>>>>> ok with this. We could possibly provide a utility to do that
>>>>>>> transformation.
>>>>>>>
>>>>>>> For the extensibility, I used a different definition. It happens to
>>>>>>> parse prov-xml compliant xml. When serializing, it  puts all
>>>>>>> extensibility elements at the end.  This is not a satisfactory
>>>>>>> solution, and is likely to be dependent on the jaxb implementation
>>>>>>> (though I am not entirely sure).
>>>>>>>
>>>>>>>
>>>>>>> <xs:complexType name="Document">
>>>>>>>    <xs:sequence>
>>>>>>>      <xs:choice maxOccurs="unbounded">
>>>>>>>        <xs:group ref="prov:documentElements"/>
>>>>>>>        <xs:element name="bundleContent" type="prov:NamedBundle"/>
>>>>>>>      </xs:choice>
>>>>>>>      <xs:any namespace="##other" processContents="lax" minOccurs="0"
>>>>>>> maxOccurs="unbounded"/>
>>>>>>>    </xs:sequence>
>>>>>>> </xs:complexType>
>>>>>>>
>>>>>>> Can something be done to make the XML schema a bit more jaxb friendly,
>>>>>>> while still keeping the same flexibility?  Thoughts welcome.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Luc
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>> -- 
>>>>> Professor Luc Moreau
>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Thursday, 28 March 2013 16:43:53 UTC