Re: PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML Serialization]

Hi Hook,
Thanks for trying to solve this issue.
Can you give an example of what a document containing a bundle would 
look like?
Luc


On 03/28/2013 10:40 AM, Hua, Hook (388C) wrote:
> Hi Luc and Stephan,
>
> Somehow, with jaxb-ri-2.2.6, the removal of xsd:any still generates
> JAXBElements.
>
> Hopefully we may not need to modify the xsd:any support nor use customized
> bindings mapping for JAXB. In looking into it further, I believe I have
> found a more upstream cause and a potentially cleaner solution.
>
> Given that we have the following unfriendly XML binding mapping:
>
> ------------------------------------------------------
> <xs:element name="document" type="prov:Document" />
>
> <xs:complexType name="Document">
>    <xs:sequence maxOccurs="unbounded">
>      <xs:group ref="prov:documentElements" minOccurs="0"/>
>      <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>      <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>    </xs:sequence>
>    </xs:complexType>
>
> ---------
>
>
> public class Document {
>    @XmlElementRefs({
>    @XmlElementRef(name = "wasRevisionOf", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    @XmlElementRef(name = "activity", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    @XmlElementRef(name = "collection", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    @XmlElementRef(name = "bundle", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    @XmlElementRef(name = "wasQuotedFrom", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    @XmlElementRef(name = "wasInvalidatedBy", namespace =
> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required = false),
>    ...
>    })
>    @XmlAnyElement(lax = true)
>    protected List<Object> entityAndActivityAndWasGeneratedBy;
>    ...
> ------------------------------------------------------
>
> It appears that to help retain a round-trip marshalling/unmarshalling of
> our prov:Document, the unbounded sequence of its elements (including
> prov:BundleConstructor) must be uniquely distinguished by JAXB. The
> repeating sequences are treated as a List<Object> of generic JAXBElements,
> where the JAXBElement's QName is used to distinguish elements with
> different names. So the culprit may be the unbounded sequence.
>
> <xs:sequence maxOccurs="unbounded">
>    <xs:group ref="prov:documentElements" minOccurs="0"/>
>    <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>    <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>    </xs:sequence>
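
To illustrate the point about the QName: when everything lands in one List<Object>, the consumer has to inspect each wrapper's name to know which PROV element it is holding. The following is only a rough sketch under stated assumptions. NamedElement is a hypothetical stand-in for javax.xml.bind.JAXBElement (so the example compiles without JAXB on the classpath), and Entity and Activity are hypothetical stand-ins for the generated classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.namespace.QName;

public class MixedContentSketch {
    public static final String PROV_NS = "http://www.w3.org/ns/prov#";

    /** Hypothetical stand-in for JAXBElement: a value tagged with its element name. */
    public static final class NamedElement<T> {
        public final QName name;
        public final T value;
        public NamedElement(QName name, T value) { this.name = name; this.value = value; }
    }

    /** Hypothetical stand-ins for generated PROV classes. */
    public static final class Entity {
        public final String id;
        public Entity(String id) { this.id = id; }
    }
    public static final class Activity {
        public final String id;
        public Activity(String id) { this.id = id; }
    }

    /** Counts items per element local name; unwrapped items are treated as xs:any content. */
    public static Map<String, Integer> countByElementName(List<Object> content) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Object item : content) {
            String key = (item instanceof NamedElement)
                    ? ((NamedElement<?>) item).name.getLocalPart()
                    : "(other)";
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Object> content = new ArrayList<>();
        content.add(new NamedElement<>(new QName(PROV_NS, "entity"), new Entity("e1")));
        content.add(new NamedElement<>(new QName(PROV_NS, "activity"), new Activity("a1")));
        content.add(new NamedElement<>(new QName(PROV_NS, "entity"), new Entity("e2")));
        content.add("foreign-element"); // stands in for something matched by xs:any
        System.out.println(countByElementName(content));
    }
}
```

The instanceof check and cast on every item is the inconvenience this thread is trying to avoid.
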
>
>
>
>
> What if we move the unbounded occurrence into a wrapper complex type and
> keep the sequence singular? Below, I've introduced a "prov:DocumentBundle"
> wrapper complex type to which the unbounded occurrence is applied.
> "prov:DocumentBundle" keeps the same subelements as before, but as
> one occurrence of the sequence. Running it through JAXB now generates the
> cleaner prov-typed List elements. No customized bindings for JAXB needed.
> No removal of xsd:any needed.
>
>
> ------------------------------------------------------
> <xs:element name="document" type="prov:Document" />
>
>    <xs:complexType name="DocumentBundle">
>    <xs:sequence>
>      <xs:group ref="prov:documentElements" minOccurs="0"/>
>      <xs:element name="bundleContent" type="prov:BundleConstructor"
> minOccurs="0"/>
>      <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>    </xs:sequence>
>    </xs:complexType>
>
>    <xs:complexType name="Document">
>    <xs:sequence>
>      <xs:element name="documentBundle" type="prov:DocumentBundle"
> minOccurs="0" maxOccurs="unbounded"/>
>    </xs:sequence>
>    </xs:complexType>
>
> ---------
>
> public class Document {
>    protected DocumentBundle documentBundle;
>    ...
>
> public class DocumentBundle {
>    protected List<Entity> entity;
>    protected List<Activity> activity;
>    protected List<Generation> wasGeneratedBy;
>    protected List<Usage> used;
>    protected List<Communication> wasInformedBy;
>    protected List<Start> wasStartedBy;
>    protected List<End> wasEndedBy;
>    protected List<Invalidation> wasInvalidatedBy;
>    ...
> ------------------------------------------------------
>
> We could also rename the wrapper "prov:DocumentBundle" to something else
> to reduce possible confusion with prov:Bundle and prov:BundleConstructor.
>
>
>
> I think we need to understand that this approach introduces another
> indirection artifact in the PROV-XML encoding. Would this be an acceptable
> compromise approach around the JAXBElement issue?
>
> --Hook
>
>
>
>
>
>   
>
>
> On 3/21/13 8:49 AM, "Stephan Zednik" <zednis@rpi.edu> wrote:
>
>>
>> On Mar 21, 2013, at 5:26 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>
>>> Hi Hook,
>>>
>>> Thanks for this analysis.
>>>
>>> In this specific instance, I think that it is the element
>>>   <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>> occurring inside
>>>   <xs:sequence maxOccurs="unbounded">
>>> that causes these jaxb elements to be generated.
>>>
>>> If you were to remove xsd:any there, jaxbElements would no longer be
>>> generated.
>>>
>>> While we want to allow the possibility of elements from other schemas, do we
>>> really want to allow them anywhere inside a document/bundle?
>> We want to provide for elements from other schemas but I don't think we
>> formally identified what areas we intend to allow non-prov elements in
>> before we added this functionality to the schema.
>>
>> What if we made a FAQ entry about OXM mappings with PROV-XML and created
>> a customized schema or bindings file specifically for JAXB code
>> generation?  This would allow us to work on it asynchronously with the
>> document and past the note publication. It would also allow us to
>> introduce JAXB-specific solutions that I do not think make sense in the
>> official schema or note.
>>
>> --Stephan
>>
>>> Luc
>>>
>>>
>>> On 03/21/2013 11:09 AM, Hua, Hook (388C) wrote:
>>>> Hi Luc,
>>>>
>>>> I'm using jaxb-ri-2.2.6 against our latest prov*.xsd and seeing slightly
>>>> different bindings with JAXBElement:
>>>>
>>>> public class Document {
>>>>      @XmlElementRefs({
>>>>          @XmlElementRef(name = "hadPrimarySource", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>          @XmlElementRef(name = "agent", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>          @XmlElementRef(name = "activity", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>          @XmlElementRef(name = "organization", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>>          @XmlElementRef(name = "softwareAgent", namespace =
>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class, required =
>>>> false),
>>>> ....
>>>>
>>>>
>>>>
>>>> Some findings:
>>>>
>>>>
>>>> (1) JAXB's generation of JAXBElement<T> classes seems to be a wrapper
>>>> approach to preserve sufficient information in the schema for round-trip
>>>> marshalling & unmarshalling of values in XML instances. More specifically,
>>>> it wraps the data with a QName and a nillable flag [1].
>>>>
>>>> It appears that a frequent cause of JAXB producing JAXBElement<T> is
>>>> its attempt to preserve elements with both minOccurs=0 and nillable=true.
>>>> JAXB needs to distinguish between the two cases where:
>>>>
>>>>    a. element missing, minOccurs=0, then jaxbElement==null
>>>>    b. element present, xsi:nil=true, then jaxbElement.isNil()==true
>>>>
>>>> It would not be possible to distinguish between these two states if the
>>>> bindings were the raw types.
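
A small Java sketch of cases (a) and (b), for illustration only. NillableElement is a hypothetical stand-in for JAXBElement (its isNil() mirrors the documented behaviour: true when the nil flag is set or the value is null), and the element name "example" is made up. The sketch shows that a wrapper can represent three states, while a raw-typed property collapses "absent" and "nil" into the same null.

```java
import javax.xml.namespace.QName;

public class NilVersusAbsentSketch {

    /** Hypothetical stand-in for JAXBElement: a name, a value, and an xsi:nil flag. */
    public static final class NillableElement<T> {
        public final QName name;
        public final T value;
        public final boolean nil;

        public NillableElement(QName name, T value, boolean nil) {
            this.name = name;
            this.value = value;
            this.nil = nil;
        }

        public boolean isNil() {
            return nil || value == null;
        }
    }

    public enum State { ABSENT, NIL, PRESENT }

    /** With a wrapper, a missing element and an xsi:nil element are distinguishable. */
    public static State classifyWrapped(NillableElement<String> element) {
        if (element == null) {
            return State.ABSENT;            // case (a): element missing
        }
        return element.isNil() ? State.NIL  // case (b): element present, xsi:nil="true"
                               : State.PRESENT;
    }

    /** With the raw type, both (a) and (b) arrive as null, so NIL cannot be reported. */
    public static State classifyRaw(String value) {
        return value == null ? State.ABSENT : State.PRESENT;
    }

    public static void main(String[] args) {
        QName name = new QName("http://www.w3.org/ns/prov#", "example"); // hypothetical element
        System.out.println(classifyWrapped(null));
        System.out.println(classifyWrapped(new NillableElement<>(name, null, true)));
        System.out.println(classifyWrapped(new NillableElement<>(name, "v", false)));
        System.out.println(classifyRaw(null));
    }
}
```
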
>>>>
>>>>
>>>>
>>>> (2) It would be possible to customize the JAXB bindings [2] to ignore the
>>>> full round-trip requirement. The "generateElementProperty=false"
>>>> customization option "can be used to generate an alternate developer
>>>> friendly but lossy binding" [3].
>>>>
>>>> I tried variations of a "bindings.xjb" customization file:
>>>>
>>>> <jaxb:bindings version="2.1"
>>>>    xmlns:jaxb="http://java.sun.com/xml/ns/jaxb"
>>>>    xmlns:xjc="http://java.sun.com/xml/ns/jaxb/xjc"
>>>>    xmlns:xs="http://www.w3.org/2001/XMLSchema">
>>>>    <jaxb:bindings schemaLocation="prov-core.xsd"
>>>>      node="//xs:complexType[@name='Document']">
>>>>      <jaxb:globalBindings generateElementProperty="false" />
>>>>    </jaxb:bindings>
>>>> </jaxb:bindings>
>>>>
>>>> $ xjc.sh -d BINDINGS -b bindings.xjb prov.xsd
>>>>
>>>> But none truly eliminated the JAXBElement<T> from the bindings.
>>>>
>>>>
>>>>
>>>> (3) Nowhere in our prov-core.xsd do we define minOccurs=0 in conjunction
>>>> with nillable=true. In my attempts with JAXB, I'm seeing JAXBElements
>>>> appearing in the bindings for the (a) Document class and (b)
>>>> BundleConstructor class. Both types leverage the prov:documentElements
>>>> grouping.
>>>>
>>>>    <xs:element name="document" type="prov:Document" />
>>>> <xs:complexType name="Document">
>>>>    <xs:sequence maxOccurs="unbounded">
>>>>      <xs:group ref="prov:documentElements" minOccurs="0"/>
>>>>      <xs:element name="bundleContent" type="prov:BundleConstructor"
>>>> minOccurs="0"/>
>>>>      <xs:any namespace="##other" processContents="lax" minOccurs="0" />
>>>>    </xs:sequence>
>>>>    </xs:complexType>
>>>>
>>>> It's unclear if there is some nillable-like effect that triggers JAXB to
>>>> generate the JAXBElements.
>>>>
>>>>
>>>>
>>>> (4) On the upside, JAXB does provide an ObjectFactory class as part of the
>>>> generated bindings that defines creational factory methods to generate the
>>>> JAXBElement instances. For example:
>>>>
>>>>    public JAXBElement<Usage> createUsed(Usage value)
>>>>
>>>> Still, I agree that it is not as clean.
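
For readers unfamiliar with the generated ObjectFactory, here is a rough sketch of what a factory method such as createUsed amounts to: it pairs the value with a fixed element name so that callers never build the QName by hand. Everything below is a hypothetical stand-in written so the example runs without JAXB on the classpath: NamedElement plays the role of JAXBElement, and Usage and ObjectFactory imitate the generated classes rather than reproduce them.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.namespace.QName;

public class ObjectFactorySketch {

    /** Hypothetical stand-in for JAXBElement: element name, declared type, and value. */
    public static final class NamedElement<T> {
        public final QName name;
        public final Class<T> declaredType;
        public final T value;

        public NamedElement(QName name, Class<T> declaredType, T value) {
            this.name = name;
            this.declaredType = declaredType;
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the generated Usage class. */
    public static final class Usage {
        public final String activityId;
        public final String entityId;

        public Usage(String activityId, String entityId) {
            this.activityId = activityId;
            this.entityId = entityId;
        }
    }

    /** Hypothetical stand-in for the generated ObjectFactory. */
    public static final class ObjectFactory {
        private static final QName USED_QNAME =
                new QName("http://www.w3.org/ns/prov#", "used");

        /** Wraps the value with the element name it should be serialized under. */
        public NamedElement<Usage> createUsed(Usage value) {
            return new NamedElement<>(USED_QNAME, Usage.class, value);
        }
    }

    public static void main(String[] args) {
        ObjectFactory factory = new ObjectFactory();
        List<Object> content = new ArrayList<>(); // plays the role of the mixed List<Object>
        content.add(factory.createUsed(new Usage("a1", "e1")));

        NamedElement<?> first = (NamedElement<?>) content.get(0);
        System.out.println(first.name.getLocalPart() + " wraps "
                + first.declaredType.getSimpleName());
    }
}
```

The factory removes the burden of constructing wrappers, but reading the list back still requires unwrapping each item.
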
>>>>
>>>>
>>>> --Hook
>>>>
>>>>
>>>> [1] http://docs.oracle.com/javaee/5/api/javax/xml/bind/JAXBElement.html
>>>> [2] http://docs.oracle.com/cd/E17802_01/webservices/webservices/docs/1.5/tutorial/doc/JAXBUsing4.html#wp148515
>>>> [3] http://docs.oracle.com/cd/E17802_01/webservices/webservices/reference/tutorials/wsit/doc/DataBinding5.html
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 3/8/13 4:20 AM, "Provenance Working Group Issue Tracker"
>>>> <sysbot+tracker@w3.org> wrote:
>>>>
>>>>> PROV-ISSUE-648: Can schema be made a bit more jaxb friendly? [XML
>>>>> Serialization]
>>>>>
>>>>> http://www.w3.org/2011/prov/track/issues/648
>>>>>
>>>>> Raised by: Luc Moreau
>>>>> On product: XML Serialization
>>>>>
>>>>>
>>>>> Hi
>>>>>
>>>>> I have ported the ProvToolbox and the ProvValidator to the new XML schema.
>>>>> I just wanted to report on my experience with the schema and JAXB.
>>>>> Obviously, others may have better experience with JAXB and may be able
>>>>> to help on some of the issues I encountered.
>>>>>
>>>>> Everything worked fine, except:
>>>>> - <xs:element ref="prov:internalElement" abstract="true"/>
>>>>> - extensibility <xs:any namespace="##other"/> in Document and Bundle
>>>>>
>>>>>
>>>>> These two constructs, while processable by JAXB, are not JAXB-friendly.
>>>>>
>>>>> Indeed, JAXB compiles the schema into a list containing all possible
>>>>> statements.
>>>>>
>>>>>     protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>
>>>>> However, the presence of an abstract element and an <any/> element
>>>>> results in the content of that list being of type:
>>>>>
>>>>>
>>>>>     @XmlElementRefs({
>>>>>         @XmlElementRef(name = "used", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>         @XmlElementRef(name = "wasAssociatedWith", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>         @XmlElementRef(name = "person", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>         @XmlElementRef(name = "entity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = JAXBElement.class),
>>>>>         @XmlElementRef(name = "wasInfluencedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#"
>>>>> ....
>>>>>     })
>>>>>
>>>>>     @XmlAnyElement(lax = true)
>>>>>     protected List<Object> entityAndActivityAndWasGeneratedBy;
>>>>>
>>>>> where all data structures are wrapped up in this unpleasant
>>>>> JAXBElement.
>>>>>
>>>>> Without these features, we get a much more natural mapping:
>>>>>     @XmlElements({
>>>>>         @XmlElement(name = "entity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Entity.class),
>>>>>         @XmlElement(name = "activity", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Activity.class),
>>>>>         @XmlElement(name = "wasGeneratedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = WasGeneratedBy.class),
>>>>>         @XmlElement(name = "used", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = Used.class),
>>>>>         @XmlElement(name = "wasInformedBy", namespace =
>>>>> "http://www.w3.org/ns/prov#", type = WasInformedBy.class),
>>>>>     ...
>>>>> })
>>>>>
>>>>> So, how did I solve the problem?  I inserted the extension schemas into
>>>>> the schema file, and hence got rid of the abstract element.  I am ok with
>>>>> this. We could possibly provide a utility to do that transformation.
>>>>>
>>>>> For the extensibility, I used a different definition. It happens to
>>>>> parse prov-xml compliant xml. When serializing, it puts all
>>>>> extensibility elements at the end.  This is not a satisfactory
>>>>> solution, and is likely to be dependent on the jaxb implementation
>>>>> (though I am not entirely sure).
>>>>>
>>>>>
>>>>>    <xs:complexType name="Document">
>>>>>      <xs:sequence>
>>>>>        <xs:choice maxOccurs="unbounded">
>>>>>          <xs:group ref="prov:documentElements"/>
>>>>>          <xs:element name="bundleContent" type="prov:NamedBundle"/>
>>>>>        </xs:choice>
>>>>>        <xs:any namespace="##other" processContents="lax" minOccurs="0"
>>>>> maxOccurs="unbounded"/>
>>>>>      </xs:sequence>
>>>>>    </xs:complexType>
>>>>>
>>>>> Can something be done to make the XML schema a bit more jaxb friendly,
>>>>> while still keeping the same flexibility?  Thoughts welcome.
>>>>>
>>>>> Cheers,
>>>>> Luc
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>> -- 
>>> Professor Luc Moreau
>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>> University of Southampton          fax:   +44 23 8059 2865
>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>
>>>
>>>
>>

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Thursday, 28 March 2013 16:38:40 UTC