Re: PROV-ISSUE-409 (prov-dm-review-LC): feedback on PROV-DM document (for last call release) [prov-dm]

Hi,

I realized that, in my review, I didn't answer all the questions that were
asked by the editors. You will find my answers below.

Thanks, khalid


   1. Can the document be released as a WD? Yes, provided that the
   contextualization definition is amended in the light of the comments below.
   2. Can the document be released as a last call WD? Yes.
   3. Renaming wasRevisionOf to wasRevisedFrom? It is fine with me either
   way.
   4. Primitive datatypes. Do we have to list them all? I think it would be
   good, but I wouldn't say they are mandatory.


> 2012/6/17 Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
>
>> Hi,
>>
>> I read the new draft of the prov-dm. You will find my comments below.
>> Regarding the question of the editors about contextualization: I am not
>> opposed to its presence in the DM, but its definition should be simplified
>> substantially (see the comments below).
>>
>> Regards, khalid
>>
>> -----------
>> - In the beginning of the document (PROV Family Specification), it is
>> stated that "PROV-O, the PROV ontology, an OWL-RL ontology allowing the
>> mapping of PROV to RDF". I am not sure that PROV-O is entirely OWL-RL
>> compliant. We have been using in PROV-O the term OWL-RL++, because there are
>> minor violations of OWL-RL in a few places in the ontology.
>>
>> - In the Table of Contents, the titles of Section 4.1 and Section 4.2 may
>> need to be detailed a bit more. As they are, they are not informative
>> when the reader is browsing the table of contents.
>>
>> - In the introduction, in the list that describes the components, there
>> is a mismatch between this list and the components in the table of
>> contents: according to the list in the introduction, component 2 is about
>> agents and component 3 is about derivations, whereas according to the table
>> of contents, component 2 is about derivations and component 3 is about
>> agents.
>>
>> - Section 2 is supposed to be an overview, but it is quite long.
>>
>> - Section 2 distinguishes between binary and expanded relations. I
>> am not sure this makes sense in the context of the DM. The distinction was
>> introduced in PROV-O, because the language we are using there is not
>> expressive enough for specifying n-ary relations in a natural way. This is
>> not the case for PROV-DM: PROV-N allows expressing such relations without a
>> problem. Also, reading the section on Expanded relations from the point of
>> view of a reader who is not part of the working group, it seems that this
>> is a source of confusion, and I don't see a real benefit from its presence
>> in the DM.
>>
>> - In Section 2, when explaining "Usage", it is said that "Usage is the
>> beginning of utilizing an entity by an activity. Before usage, the activity
>> had not begun to utilize this entity and could not have been affected by the
>> entity.". This statement does not hold when an entity is used multiple
>> times by the same activity (e.g., to feed different parameters).
>>
>> - The discussion that follows Example 3, which explains that a car is
>> actually used and that another car is generated at the end of the journey,
>> is one possible interpretation, but I don't think it is the most natural
>> one. A simpler interpretation, which the reader may grasp quickly, is that
>> the driving activity used a car, and that's it. Not every activity
>> needs to generate an entity.
>>
>> - In Section 2.1.3, first paragraph: "more trustworthy that that from a
>> lobby organization" -> "more trustworthy than that from a lobby
>> organization"
>>
>> - In Section 2.1.3, in the statement about Delegation, it may be worth
>> specifying the scope of delegation: is the delegation valid for a
>> given activity or for all activities carried out by the agent?
>>
>> - In example 13, "[...] but also determine who its provenance is
>> attributed to [...]". This sentence implies that an agent is always a
>> human. "who" can be replaced by "the agent" to avoid confusion.
>>
>> - The column "Core Structures", in Table 3, is confusing. Components 1, 2
>> and 3 do not contain only core concepts.
>>
>> - In the UML diagram in Figure 5, as well as in other UML diagrams,
>> "attributes" is defined as a field for Entity, Activity and others. Looking
>> just at the UML diagram, the reader may think that there is a field called
>> attributes!
>>
>> - In the definition of communication, Section 5.1.5, it is stated that
>> "Communication is the exchange of an unspecified entity". Why do we require
>> that the entity be unspecified? Aren't we restricting those who may want
>> to specify the entity (or entities) exchanged between two activities?
>> I would suggest rephrasing that sentence along the following
>> lines: "Communication is the exchange of an entity that may be unspecified".
>>
>> - I notice that Invalidation (Section 5.1.8), is not present in Figure 5.
>>
>>
>> - In section 5.2 (Component 2: Derivations), the first sentence in this
>> section says "The third component".
>>
>> - I find the definition of "Primary Source" hard to follow. Can we
>> simplify it?
>>
>> - In the definition of delegation, the activity is an optional argument.
>> What are the semantics of delegation when the activity is not specified? I
>> suspect that it means that the activity for which the delegation holds is
>> unknown. However, the reader may think that the delegation holds for all the
>> activities that are carried out by the agent in question.
>>
>> - The first paragraph, 3rd sentence, in Section 5.4, "It comprises a
>> Bundle class and a subclass of Entity"-> "It specifies that Bundle is a
>> sub-class of Entity".
>>
>> - The first sentence in Example 40 states that "A provenance aggregator
>> could merge two bundles". The verb "merge" has strong semantics that do
>> not apply in this case. I think we could simply say "could union"?
>>
>> - Section 5.5.3 on contextualization is difficult to follow. The third
>> paragraph in this section states that "A bundle's description provide a
>> context in which to interpret an entity in a domain-specific manner". This
>> is not reflected in the definition of bundle, which, from my understanding,
>> aggregates a number of provenance descriptions that happen (by accident) to
>> be in a bundle, e.g., a file. The notion of context and domain dependency
>> introduced in contextualization seems to assume that the provenance
>> descriptions within a bundle are domain dependent and that they
>> have been specified within a given context. The notion of context is also
>> loose, and can mean different things to different people.
>>
>> Now, looking at example 45, it may be that the first paragraphs in
>> Section 5.5.3 are misleading, and that the purpose is to have something
>> simple. If the objective is basically to specify that a given entity e1 is
>> a specialization of another entity e2 and to be able to locate the bundle
>> in which e2 is described, then we should just do that. In other words, we
>> should use "specializationOf", and add a construct that specifies the bundle
>> in which a given entity is described, e.g., isDescribedIn(e2,bundle2)?
>>
>> Therefore, to answer the question that the editors asked regarding
>> contextualization, I do not oppose its presence in the DM, but I think its
>> definition should be simplified substantially to reflect the way it will be
>> used in practice. I would also urge the editors to avoid using the term
>> contextualization as it is vague.
>>
>> - In section 5.6.1, it is stated that collection is a multiset because it
>> may not be possible to verify that two distinct entity identifiers do not
>> denote the same entity. This is one reason, but not the main one.
>> Collection is a general construct, and we should allow people to construct
>> collections that contain duplicate entities with different or same
>> identifiers.
>>
>>
>> On 14 June 2012 12:07, Provenance Working Group Issue Tracker <
>> sysbot+tracker@w3.org> wrote:
>>
>>> PROV-ISSUE-409 (prov-dm-review-LC): feedback on PROV-DM document (for
>>> last call release) [prov-dm]
>>>
>>> http://www.w3.org/2011/prov/track/issues/409
>>>
>>> Raised by: Luc Moreau
>>> On product: prov-dm
>>>
>>>
>>> This is the issue to collect feedback on the prov-dm document.
>>>
>>> Document to review is available from:
>>>
>>>
>>> http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120614/prov-dm.html
>>>
>>> Question for reviewers:
>>> http://www.w3.org/2011/prov/wiki/Meetings:Telecon2012.06.14
>>>
>>> Cheers,
>>> Luc
>>>
>>>
>>>
>>>
>>
>

Received on Monday, 18 June 2012 16:22:22 UTC