RE: Call for consensus - Localization Quality Précis (related to [ISSUE-34])

Hi Felix,

I think the inline score mapping you've illustrated here may be useful to include, but I believe you are talking about a different community scoring system than I am.

In my limited understanding of community scoring mechanisms in general, there are two paradigms:

Rating - which uses a finite range of scalar values (e.g. the 0-5 stars used by Amazon and iTunes)
Voting - which uses plus and minus increments to create a positive or negative aggregate 'score' value (e.g. the 'thumbs up/down' used in YouTube comments)

I believe you are referring to a Rating system in your example below.

However, Community Translation scoring systems tend to use the latter 'Voting' system. Facebook does for sure, and so do we (Adobe). I uploaded an example from Facebook's Translation app here: http://twitpic.com/an3z66

This particular segment has 3 translation candidates that users can vote on. Users can click either the checkbox (+ve vote) or the 'x' (-ve vote). The aggregate of all users' votes for each candidate translation represents the 'quality score' for that candidate. It is this segment-level aggregate score information that I see as a good fit for the locQualityPrecisScore* data category, but obviously it can have both negative and positive values.
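To make that aggregation concrete, here is a minimal sketch in Java (entirely hypothetical names; this is not Facebook's or Adobe's actual code) of how per-candidate up/down votes roll up into a segment-level score:

/** Hypothetical model: one translation candidate with community votes. */
class VotedCandidate {
    final String translation;
    int upVotes;   // checkbox clicks (+ve votes)
    int downVotes; // 'x' clicks (-ve votes)

    VotedCandidate(String translation) {
        this.translation = translation;
    }

    /** Aggregate quality score: may be negative, zero, or positive. */
    int score() {
        return upVotes - downVotes;
    }
}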

However, none of this necessarily precludes the use of the mapping rules you propose. The use cases may be different, though. This would be useful in a Moderator (~= Translation Review) workflow, where a moderator is presented with candidates and has to select the best one.
A system may employ logic to filter the candidates based on their scores, such that only the best candidates get presented to the moderator.
So your example could become something like:
<its:translateRule selector="//*[@translation-need &lt; 0]" translate="no"/>
Which could be interpreted by a system to mean 'Do not present any candidates to the moderator that have a score under 0'.
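In code, that moderator-side filter might look something like this (a hypothetical sketch only; plain integer scores stand in for real candidate objects):

import java.util.ArrayList;
import java.util.List;

public class CandidateFilter {
    /** Keeps only candidates whose aggregate vote score is 0 or above. */
    public static List<Integer> presentable(List<Integer> scores) {
        List<Integer> keep = new ArrayList<>();
        for (int s : scores) {
            if (s >= 0) { // a score under 0 is never shown to the moderator
                keep.add(s);
            }
        }
        return keep;
    }
}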

I don't wish to labour the point or slow the process down in any way, but I do believe negative scoring should be considered for this (even within a finite range), as there is an established use case to support it.

Regards
Des


From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: 24 August 2012 09:50
To: Yves Savourel; Des Oates
Cc: public-multilingualweb-lt@w3.org
Subject: Re: Call for consensus - Localization Quality Précis (related to [ISSUE-34])

Hi Yves, all,

we had similar issues during the development of ITS 1.0: what if existing information related to translation is not expressed via "yes" or "no", but via other values, and maybe even more of them? With the global rules mechanism, you can do something like this:

<its:translateRule selector="//*[@translation-need &lt;= 0.5]" translate="no"/>
<its:translateRule selector="//*[@translation-need &gt; 0.5]" translate="yes"/>

This assumes that, in the "translation-need" attribute, values below or equal to 0.5 carry the same semantics as ITS translate "no", and values above 0.5 the same as translate "yes".

Now, I assume that in community-based workflows the situation would be similar: you would have a small set of values (e.g. 0-5). These could be mapped to the score we envisage with six global rules:

<its:locQualityScoreRule selector="/doc[@my-own-score=5]" locQualityPrecisScore="100"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=4]" locQualityPrecisScore="80"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=3]" locQualityPrecisScore="60"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=2]" locQualityPrecisScore="40"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=1]" locQualityPrecisScore="20"/>
<its:locQualityScoreRule selector="/doc[@my-own-score=0]" locQualityPrecisScore="0"/>

So if the main types of workflows you have in mind use values like the above, I think both requirements (a fixed set of values in ITS, and different values in the input data) could be fulfilled.
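In other words, the six rules implement a simple linear mapping, which a tool could also compute directly (a hypothetical helper, not part of any ITS processor):

public static int locQualityPrecisScore(int myOwnScore) {
        // 0..5 maps linearly onto 0..100: each step of my-own-score is worth 20 points
        return myOwnScore * 20;
}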

Best,

Felix
2012/8/23 Yves Savourel <ysavourel@enlaso.com>
Hi Des, all,

> ...Can I provide my own scoring range if I supply my
> own locQualityPrecisProfileRef?
> ...
> In summary, I don't think it makes sense to constrain
> the values of locQualityPrecisScore* if a user
> provides a locQualityPrecisProfile* that can provide
> semantic meaning to the score values that lie
> outside the [0-100] range.
This looks like the ever-problematic issue of providing either perfect interoperability for a few or partial interoperability for all.

I would agree with you if locQualityPrecisProfile* pointed to a standardized resource that the user agent, *without other knowledge*, could parse to discover what to expect as values/range.

But without a standard profile, if we allow different types of values based on just a URI, only the tools with specific knowledge of what that URI means for the values will be able to interact with them. The others will have no clue.

The system with a simple 0.0 to 100.0 range is obviously also flawed, because one has to map the original values to a given range, and that may be tricky.

How would you map negative/positive N to 0-100?
I'm guessing there has to be some high and low limit, even if it's MAXINT.

In that case you could use a function such as the one we use in Okapi:

/**
 * Given a numeric range and a value, normalizes that value onto a scale between 0 and 100.
 * @param low lowest value of the original range
 * @param high highest value of the original range
 * @param value the value that needs to be mapped to 0-100
 * @return the mapped value, an integer between 0 and 100
 */
public static int normalizeRange(float low, float high, float value) {
        float m = 0.0f;   // low end of the target range
        float n = 100.0f; // high end of the target range
        return (int) (m + ((value - low) / (high - low) * (n - m)));
}

Obviously we lose precision if the original range is wider than 100. For example, for an original range of -100 to 100 you get:

normalizeRange(-100, 100, -15) == 42
normalizeRange(-100, 100, -16) == 42

Going to a decimal value would allow more precision.
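For example, a decimal variant of the method above (a sketch; as far as I know this exact method does not exist in Okapi) would keep the distinction:

public static float normalizeRangeDecimal(float low, float high, float value) {
        float m = 0.0f;   // low end of the target range
        float n = 100.0f; // high end of the target range
        return m + ((value - low) / (high - low) * (n - m));
}

// normalizeRangeDecimal(-100, 100, -15) == 42.5f
// normalizeRangeDecimal(-100, 100, -16) == 42.0f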


The other issue is that mapping back from 0-100 to negative/positive N is not going to work perfectly.
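To illustrate, a hypothetical inverse of normalizeRange (again just a sketch, not an existing Okapi method) cannot recover a value that the forward mapping has already collapsed:

public static int denormalizeRange(float low, float high, int score) {
        // maps a 0-100 score back onto the original [low, high] range
        return (int) (low + (score / 100.0f) * (high - low));
}

// Round trip over an original range of -100 to 100:
// normalizeRange(-100, 100, -15)  == 42
// denormalizeRange(-100, 100, 42) == -16, not -15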

But the idea is that all tools will get a meaningful value. Yes, with a loss of precision in some cases, but I would expect that to be acceptable for Score.

This said, having the profile declared should help tools knowledgeable of that profile to map the ITS value back to the native one.

My suggestion would be to define your own attribute in addition to locQualityPrecisScore, with the native value.


a) If you have your own value in the ITS score attribute, with your profile declared:

- tools knowing about your system can use the value directly.
- tools not knowing about your system cannot use the value safely.


b) If you have only the standardized ITS score and your profile declared:

- tools knowing about your system can use the ITS value.
- tools knowing about your system can (in some cases) map the ITS value back to the native one.
- tools not knowing about your system can use an ITS value that is meaningful.
- tools not knowing about your system can modify the ITS value in a way that can be (for some systems) mapped back to your system.


c) If you have both the standardized ITS score and your own value and your profile declared:

- tools knowing about your system can use the ITS value.
- tools knowing about your system can use the real native value.
- tools knowing about your system can update both values properly.
- tools not knowing about your system can use an ITS value that is meaningful.
- tools not knowing about your system can modify the ITS value in a way that can be (for some systems) mapped back to your system.

Writing this email made me think: maybe there is a range other than 0.0-100.0 that is more suitable for mapping back and forth to other ranges? Maybe -100 to 100? Anyone good at math have a suggestion?

I don't necessarily want 0-100, but I think having a standardized range is important.

Cheers,
-yves





--
Felix Sasaki
DFKI / W3C Fellow
