RE: [all] call for concensus on Translation Provenance Agent (related to ISSUE-22) from Des Oates on 2012-07-26 (public-multilingualweb-lt@w3.org from July 2012)

From: Des Oates <doates@adobe.com>
Date: Thu, 26 Jul 2012 12:30:59 +0100
To: Felix Sasaki <fsasaki@w3.org>, Dave Lewis <dave.lewis@cs.tcd.ie>
CC: "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>
Message-ID: <7B8D77012FE36343856B6DE17A307DD284C0ADEF15@eurmbx01.eur.adobe.com>
Hi folks

I'm still playing catch-up to get up to speed with the great progress being made on the spec, but I've a few comments on the Provenance agent that I'd like to add into the mix.

1) 'trans'Agent versus 'prov'. I don't believe we should be limiting provenance specifically to translation related activities, so I'd be in favour of using provAgent, rather than transAgent, such that it can be applicable in other domains rather than be exclusive to translation activity.
E.g. source text pre-processing, target text post-processing, document/data format/encoding conversion processes, metric calculation processes etc.
All of these activities may exist in a translation workflow, but are not specific to translation per se.

2) Related to that above, I'm not sure if it is sufficient to have provenance information related specifically to translation creation, and translation revision. To me this couples the category too tightly to a small subset of activities at the potential exclusion of other activities in the complete roundtrip process.

So although it potentially adds an additional attribute to the markup, I'd be in favour of just a 'provAgent' to indicate provenance, with an optional field of agentType or similar that I believe (apologies if I am wrong) Felix suggested earlier in the thread.


Des



From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: 26 July 2012 11:34
To: Dave Lewis
Cc: public-multilingualweb-lt@w3.org
Subject: Re: [all] call for concensus on Translation Provenance Agent (related to ISSUE-22)

Hi Dave,

you are right about the rule precedence, good point. A question about the separation "transAgent" vs. "revisionAgent" in general: is it important to specify the order, e.g. who did the first revision, the second one etc?

A few more questions about the URIs in the "transAgentRef" and "transRevisionAgentRef" attributes:

1) Do you say anything about the type of information to be expected, e.g. machine readable or human readable information? E.g. for "locnote" we focus on examples with human readable information, also in the "ref" attributes; but in your examples you have the "mailto" scheme. How can an application know what is expected here, or do you have "best practices" for what kind of machine readable information should be provided?

2) In the transAgent / transAgentRef attributes, several values are possible. But does it really make sense to have a transAgentRef without a transAgent? Same for the revision agent. So instead of "at least one of the following" over all four attributes, you could say: at least one of the following: a transAgent attribute with an optional transAgentRef, or a transRevisionAgent attribute with an optional transRevisionAgentRef attribute.

3) I would also locally say that the agent and the "ref" attributes MUST appear at the same node, and that the "agent" attribute is mandatory. Otherwise, you run into trouble with complex inheritance rules: what is overriding what?

4) Another option to make things clearer globally would be two rule elements: one <its:transAgentRule>, one <its:revisionAgentRule>, again with optional "ref" attributes. That would also more directly reflect the local approach.

5) Is the order of the comma separated values in the attributes significant, and what happens if a value is missing? In the local example you have C3PO as transAgent and these URIs as transAgentRef: mailto:locutus@b.org http://www.thecollective.org
Does this mean that both relate to C3PO, or that for the 2nd URI there is just no transAgent given? Again, it sounds like making the agent attribute mandatory and having the ref attributes optional would lower the number of choices and increase interop.
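The pairing ambiguity in the last question can be sketched in a few lines of Python. This is only an illustration under one assumed reading (names and references paired by position); the proposal as quoted in this thread does not define any pairing, and the function name pair_agents is hypothetical.

```python
from itertools import zip_longest

def pair_agents(trans_agent, trans_agent_ref):
    """Pair agent names with reference IRIs by position (an assumed reading)."""
    # Agent names are comma separated; references are whitespace separated.
    agents = [a.strip() for a in trans_agent.split(",") if a.strip()]
    refs = trans_agent_ref.split()
    # zip_longest pads the shorter list with None, exposing unmatched values.
    return list(zip_longest(agents, refs))

# Values taken from the local example discussed above.
pairs = pair_agents("C3PO", "mailto:locutus@b.org http://www.thecollective.org")
for agent, ref in pairs:
    print(agent, ref)
```

Under positional pairing the second URI ends up with no agent name at all, which is the case the question asks about.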

Felix

2012/7/26 Dave Lewis <dave.lewis@cs.tcd.ie>
On 26/07/2012 08:01, Felix Sasaki wrote:
P.S.: having just "agent" has of course the drawback that you need more rule elements to express the same information.
However, it has the benefit that you can be more specific wrt optionality of attributes: currently, all "agent" related attributes are attributes, so this

You mean all 'attributes are optional' right? Yes, that's a good point. I wasn't sure about the correct formulation for this and just took the lead from the rubyRule where all the attributes are also optional, but you are right this leaves the meaningless option of having no attribute for agent (I'm not sure if the same is a problem for ruby).

Would a better formulation be the following?

 *   A required selector attribute. It contains an XPath expression which selects the nodes to which this rule applies.
 *   At least one of the following:

    *   A transAgent attribute that contains one or more comma-separated strings, each one identifying a different translation agent.
    *   A transAgentRef attribute that contains one or more space-separated IRIs, each referring to a resource that identifies a different translation agent.
    *   A transRevisionAgent attribute that contains one or more comma-separated strings, each one identifying a different translation revision agent.
    *   A transRevisionAgentRef attribute that contains one or more space-separated IRIs, each referring to a resource that identifies a different translation revision agent.
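
The formulation above amounts to a small validity check, sketched here in Python. It is a minimal illustration, not part of the proposal: the element name its:agentRule is borrowed from the examples elsewhere in this thread, the function name check_rule is hypothetical, and the selector is only checked for presence, not evaluated as XPath.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
AGENT_ATTRS = ("transAgent", "transAgentRef",
               "transRevisionAgent", "transRevisionAgentRef")

def check_rule(xml_text):
    """Return a list of problems found in one rule element (empty if none)."""
    rule = ET.fromstring(xml_text)
    problems = []
    # The selector attribute is required.
    if not rule.get("selector"):
        problems.append("missing required selector attribute")
    # At least one of the four agent attributes must be present and non-empty.
    if not any(rule.get(name) for name in AGENT_ATTRS):
        problems.append("needs at least one of: " + ", ".join(AGENT_ATTRS))
    return problems

empty_rule = f'<its:agentRule xmlns:its="{ITS_NS}" selector="/html/body/par"/>'
good_rule = (f'<its:agentRule xmlns:its="{ITS_NS}" '
             'selector="/html/body/par" transAgent="John Doe"/>')

print(check_rule(empty_rule))
print(check_rule(good_rule))
```

With this wording, a rule that carries only a selector is rejected, while a rule with any one of the four attributes passes.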


<its:agentRule selector="/html/body/par"/>
would be legal, but doesn't make sense. If you have just the "agent" attribute and "agentRef", you can say that both (or just the former?) are mandatory - also the "agentType" attribute.

Felix

cheers,
Dave




2012/7/26 Felix Sasaki <fsasaki@w3.org>
Hi Dave, all,

About

"Two types of Translation Provenance Agent data categories are needed to identify:"

and the data category in general: wouldn't it be possible to have just two attributes "agent" and "agentRef", and an additional one "type" with the values "transAgent" or "revisionAgent"? That way there are fewer attributes and also fewer pointer attributes (see Yves' comment). It would look like this I think:

<its:agentRule selector="/html/body/par" its:agentRef="http://www.onlinemtexample.com/2012/7/25/legal-v1/wsdl/"  type="transAgent" />

<its:agentRule selector="/html/body/par" agent="John Doe, acme-CAT-v2.3" type="revisionAgent"/>
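
As a rough sketch of how a consumer might read rules in this shape, the Python below groups agent names and references by the "type" value. Several things here are assumptions for illustration only: the wrapping its:rules element, the use of un-prefixed attributes throughout (as in existing ITS global rules), and the function name agents_by_type.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"
ALLOWED_TYPES = {"transAgent", "revisionAgent"}

RULES = f"""
<its:rules xmlns:its="{ITS_NS}">
  <its:agentRule selector="/html/body/par"
      agentRef="http://www.onlinemtexample.com/2012/7/25/legal-v1/wsdl/"
      type="transAgent"/>
  <its:agentRule selector="/html/body/par"
      agent="John Doe, acme-CAT-v2.3" type="revisionAgent"/>
</its:rules>
""".strip()

def agents_by_type(rules_xml):
    """Group agent names and references from each rule under its type value."""
    grouped = {}
    root = ET.fromstring(rules_xml)
    for rule in root.iter(f"{{{ITS_NS}}}agentRule"):
        kind = rule.get("type")
        if kind not in ALLOWED_TYPES:
            raise ValueError(f"unexpected type: {kind!r}")
        entry = grouped.setdefault(kind, {"agents": [], "refs": []})
        # Names are comma separated; references are whitespace separated.
        names = rule.get("agent", "").split(",")
        entry["agents"] += [n.strip() for n in names if n.strip()]
        entry["refs"] += rule.get("agentRef", "").split()
    return grouped

result = agents_by_type(RULES)
print(result)
```

The single "type" attribute keeps the attribute set small, at the cost of needing one rule element per agent type.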


Small editorial thing: your examples above said "its:domainRule", I changed that to "agentRule".

Another note: in ITS global rules, we always used attributes without a namespace, e.g. "agents" instead of "its:agents".


Felix


2012/7/25 Dave Lewis <dave.lewis@cs.tcd.ie>
Hi all,
Given the implementation commitment to provenance and the previous posting on this subject, http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Jun/0161.html please find attached the proposed specification for the Translation Provenance Agent plus the example files.

As a reminder, and as discussed in the original post and mentioned at the last WG call, provenance covers two essentially independent approaches, agent provenance (which is this one) and standoff provenance, which we are treating as two individual data categories. I will send on the standoff provenance call for consensus shortly.

Regards,
Dave




--
Felix Sasaki
DFKI / W3C Fellow




Received on Thursday, 26 July 2012 11:31:41 UTC