RE: Domain mapping implementation

Hi Yves.

 

Regarding the processing step for removing duplicated values:

 

I see two places where it could be done (they are not mutually exclusive):

1.- First, remove any duplicated values from the input domain string. This applies, for example, to the comma-separated values in an HTML meta keywords tag.

2.- Remove any duplicated values from the resulting output domain string, after applying the mappings.
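To make the two places concrete, here is a minimal sketch in Python. It is only an illustration, not what the specification prescribes: the function names `dedupe` and `map_domains` are hypothetical, and I am assuming that values are compared as exact, case-sensitive strings and that unmapped values pass through unchanged.

```python
def dedupe(values):
    """Return the values with later duplicates dropped, keeping first-seen order."""
    seen = set()
    result = []
    for value in values:
        if value not in seen:
            seen.add(value)
            result.append(value)
    return result


def map_domains(input_domains, mapping):
    """Apply a source -> target mapping to a list of domain values."""
    # Place 1: drop duplicates from the input domain values.
    unique_inputs = dedupe(input_domains)
    # Apply the mappings; values without a mapping are kept as they are (assumption).
    mapped = [mapping.get(domain, domain) for domain in unique_inputs]
    # Place 2: drop duplicates that the mappings themselves may have created.
    return dedupe(mapped)


result = map_domains(["automotive", "cars", "automotive", "law"],
                     {"cars": "automotive"})
```

The second pass matters because two different input values can map to the same target, as "cars" and "automotive" do here.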

 

The source values in the domain mapping pairs should not contain duplicates, because the specification says: “The left part of the pair is part of the source content and unique within the mapping.”

 

Cheers.

__________________________________

Mauricio del Olmo Martínez

Dpto. Técnico/I+D+i

Linguaserve Internacionalización de Servicios, S.A.

Tel.: +34 91 761 64 60 ext. 0421
Fax: +34 91 542 89 28 

E-mail: tecnico@linguaserve.com

www.linguaserve.com

 


"According to the provisions set forth in articles 21 and 22 of Law 34/2002 of July 11 regarding Information Society and eCommerce Services, we will store and use your personal data with the sole purpose of marketing the products and services offered by LINGUASERVE INTERNACIONALIZACIÓN DE SERVICIOS, S.A. If you do not wish your personal data to be stored and handled, or you do not wish to receive further information regarding products and services offered by our company, please e-mail us to  <mailto:clients@linguaserve.com> clients@linguaserve.com. Your request will be processed immediately.”

__________________________________

-----Original Message-----
From: Yves Savourel [mailto:ysavourel@enlaso.com]
Sent: Tuesday, October 30, 2012 12:06
To: public-multilingualweb-lt@w3.org
Subject: RE: Domain mapping implementation

 

Hi Mauricio,

 

Good catch on the U+0023 value. I’ll change it to U+0022.

 

For the domain list:

 

I didn’t put any provision in the spec about quotation marks because the description of the HTML5 meta value does not use them.

See http://www.w3.org/TR/2011/WD-html5-20110525/semantics.html#standard-metadata-names

 

For example it would be: content="value 1,value 2, value 3" rather than content="'value 1','value 2','value 3'"

Based on the algorithm provided by HTML5 I'm assuming you would get two different results for the two entries above. But I may be wrong. 
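As a rough way to see the difference, here is a small Python approximation of comma splitting. It is not the HTML5 algorithm itself: `split_keywords` is a hypothetical helper, Python's `strip()` is only a stand-in for HTML's whitespace stripping, and I am assuming that quotation marks get no special treatment.

```python
def split_keywords(content):
    """Split a content value on commas and strip whitespace around each token."""
    return [token.strip() for token in content.split(",")]


plain = split_keywords("value 1,value 2, value 3")
quoted = split_keywords("'value 1','value 2','value 3'")

print(plain)   # tokens without delimiters
print(quoted)  # the apostrophes stay inside each token
```

Under that assumption the apostrophes remain part of each token in the second case, so the two entries would not compare equal.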

 

I'm fine with having some provision for quotation marks, but we should probably check what the HTML behavior is, that is, what list of tokens we get for the second case. Maybe Jirka or Shaun know that?

 

That also means we should probably have a test case with quotes.

 

Another aspect the current algorithm in the specification does not take into account is duplicated values. The HTML5 algorithm does remove duplicated values, so we probably should too.

 

Cheers,

-yves

 

 

 

From: Mauricio del Olmo [mailto:mauricio.delolmo@linguaserve.com]

Sent: Tuesday, October 30, 2012 3:42 AM

To: 'Felix Sasaki'; Yves Savourel

Cc: public-multilingualweb-lt@w3.org

Subject: Domain mapping implementation

 

Hi Yves.

 

Related to the domain mapping algorithm implementation, I tested it yesterday and it works.

 

I don’t do exactly the same because I load the correspondences in a map object first and then I check if there is a value for the source/original domain, but it is based on what you described.

That would be applied for each domain mapping occurrence.

More or less the following algorithm:

1. Set the default value of the output domain to the input (source or original) domain:
   1. If the node value contains a COMMA (U+002C):
      1. Split the node value into separate strings using the COMMA (U+002C) as separator.
      2. For each string:
         1. Trim the leading and trailing white space of the string.
         2. Check if there is a delimited value in the string (apostrophes or quotation marks):
            1. If one is found:
               1. Split/substring the value to obtain separately the original and target domain values.
               2. Trim each source and target domain.
               3. Remove the apostrophes or quotation marks on each domain value.
               4. Add the corresponding values to the map.
            2. Otherwise (if no delimited value is found):
               1. Split/substring the value to obtain separately the original and target domain values.
               2. Trim each source and target domain.
               3. Add the corresponding values to the map.
   2. If the node value does not contain a COMMA (U+002C):
      1. Trim the leading and trailing white space of the string.
      2. Check if there is a delimited value in the string (apostrophes or quotation marks):
         1. If one is found:
            1. Split/substring the value to obtain separately the original and target domain values.
            2. Trim each source and target domain.
            3. Remove the apostrophes or quotation marks on each domain value.
            4. Add the corresponding values to the map.
         2. Otherwise (if no delimited value is found):
            1. Split/substring the value to obtain separately the original and target domain values.
            2. Trim each source and target domain.
            3. Add the corresponding values to the map.
2. Check if there is a correspondence in the map for the input domain value.
   1. If one is found:
      1. Update the output domain value with the target value found in the map.
3. Return the output domain value.
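The steps above can be sketched roughly as follows in Python. This is an illustration only, not my actual implementation: `parse_mapping` and `map_domain` are hypothetical names, I am assuming that the left and right parts of a pair are separated by white space (as in `'criminal law' law`), and the sketch does not handle a comma inside a delimited value.

```python
import re

# One side of a pair: an apostrophe-delimited value, a quotation-mark-delimited
# value, or a run of non-space characters.
_PART = re.compile(r"""'([^']*)'|"([^"]*)"|(\S+)""")


def parse_mapping(node_value):
    """Load the source -> target correspondences into a dict."""
    mapping = {}
    # split(",") also covers the case where the value contains no comma.
    for pair in node_value.split(","):
        pair = pair.strip()
        # Exactly one group matches per part; join picks it, delimiters removed.
        parts = ["".join(m.groups(default="")).strip()
                 for m in _PART.finditer(pair)]
        if len(parts) == 2:  # silently skip anything that is not a pair
            source, target = parts
            mapping[source] = target
    return mapping


def map_domain(input_domain, node_value):
    """Return the mapped domain, or the input domain if no mapping exists."""
    mapping = parse_mapping(node_value)
    # Default to the input domain; replace it only when a correspondence is found.
    return mapping.get(input_domain, input_domain)


rules = "automotive auto, 'criminal law' law, \"property law\" law"
mapped = map_domain("criminal law", rules)
unmapped = map_domain("sports", rules)
```

Using one regular expression for both the delimited and the non-delimited case avoids duplicating the two branches of the list.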

 

One question related with the specification and the delimiters for the domain values. It says the following:

The values in the left or the right part of the mapping may contain spaces; in that case they MUST be delimited by quotation marks, that is pairs of APOSTROPHE (Unicode code point U+0027) or QUOTATION MARK (U+0023).

 

But I think the quotation mark (") is code point U+0022. With U+0023 I get the # character.

Please correct me if I’m misreading it.
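For reference, this is a quick way to check the code points with Python's standard `unicodedata` module (just a sanity check on my side, not part of the implementation):

```python
import unicodedata

# Look up the character and its Unicode name for each code point in question.
names = {}
for code_point in (0x0022, 0x0023, 0x0027):
    char = chr(code_point)
    names[code_point] = unicodedata.name(char)
    print(f"U+{code_point:04X} {char} {names[code_point]}")
```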

 

Thank you.

Cheers.

__________________________________

Mauricio del Olmo Martínez

Dpto. Técnico/I+D+i

Linguaserve Internacionalización de Servicios, S.A.

Tel.: +34 91 761 64 60 ext. 0421

Fax: +34 91 542 89 28

E-mail: tecnico@linguaserve.com

www.linguaserve.com

 


__________________________________

 

From: Felix Sasaki [mailto:fsasaki@w3.org]

Sent: Monday, October 29, 2012 18:04

To: public-multilingualweb-lt@w3.org

Subject: MLW-LT minutes 2012-11-29 and Doodle poll about 1/2 virtual f2f meetings

 

Hi all,

 

The minutes of today's call are at http://www.w3.org/2012/10/29-mlw-lt-minutes.html and below as text.

 

Since we want to move forward to last call (= feature completeness and a stable spec) by the end of November, we decided to schedule a few 1/2 day virtual f2f meetings. Please enter your availability at http://doodle.com/heh7k59h7vkvnv88#table

We will need at least three people for such a call to be able to move things forward. I added my name so that we have at least one editor who can edit the spec right away.

 

Best,

 

Felix

 

--

Felix Sasaki

DFKI / W3C Fellow

 

   [1]W3C

 

      [1] http://www.w3.org/

 

                               - DRAFT -

 

                               MLW-LT WG

 

29 Oct 2012

 

   [2]Agenda

 

      [2] http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Oct/0378.html

 

   See also: [3]IRC log

 

      [3] http://www.w3.org/2012/10/29-mlw-lt-irc

 

Attendees

 

   Present

          Yves, Pedro, Felix, Des, olaf

 

   Regrets

   Chair

          felix

 

   Scribe

          felix

 

Contents

 

     * [4]Topics

         1. [5]agenda

         2. [6]open issues

         3. [7]issue-52

         4. [8]aob

     * [9]Summary of Action Items

     __________________________________________________________

 

   <scribe> scribe: tbd

 

agenda

 

   [10] http://www.w3.org/International/multilingualweb/lt/wiki/LyonNov2012#Thursday_1st_Nov:_MLW-LT_WG_meeting_agenda

 

   felix: any thoughts on the agenda draft?

 

   yves: agenda looks good, need to talk about the tests

 

   felix: should we split sessions?

 

   dave: hard to deal with

 

   felix: will have one room, rather larger

 

   pedro: objective is to resolve implementation issues

   ... open issues about metadata definitions are not in focus

   anymore

 

   felix: focus on 1st day for implementations, 2nd day for open

   spec issues

 

   [11] https://docs.google.com/spreadsheet/ccc?key=0AgIk0-aoSKOadG5HQmJDT2EybWVvVC1VbnF5alN2S3c#gid=0

 

   felix: what is the feeling about the timeline, the milestones?

 

   revision of M2: "M2 (every implementor has run at least one

   global and one local test file, one XML and one HTML as well -

   by 15 December)"

 

   dave: if people don't do HTML or XML, they won't do parts of

   the above, but if they commit to things, they need to do it

 

   yves: milestones fine with me

 

   [12] http://www.w3.org/International/multilingualweb/lt/wiki/Use_cases_-_high_level_summary

 

   [13] http://www.w3.org/International/multilingualweb/lt/wiki/LyonNov2012#Thursday_1st_Nov:_MLW-LT_WG_meeting_agenda

 

   "review implementation isssues"

 

   yves: people on the call are the one who have listed issues

 

   felix: what is the state of phil (vistatec) implementation?

 

   dave: seen in seattle last week, again javascript based data

   for review

   ... not sure about implementation progress

 

   <scribe> ACTION: daveL to check implementation status from phil

   [recorded in

   [14] http://www.w3.org/2012/10/29-mlw-lt-minutes.html#action01]

 

open issues

 

   [15] https://www.w3.org/International/multilingualweb/lt/track/issues/open

 

   felix: what are the most pressing issues in your opinion?

 

   yves: resolve the problem of the two reference mechanisms

   ... the "tool" issue, issue-42

   ... the other one: what to do with pointer attributes in global

   rules

 

   dave: that overlaps with XLIFF mapping

 

   yves: we have provenance and localization note

   ... these naturally have standoff markup

   ... for disambiguation it might be nice to have, but to be

   decided for the data category owners

 

   felix: about XLIFF, should we just wait?

 

   yves: the people in the XLIFF TC will promote that topic, hope

   that in a few days we will have an idea about the direction

 

   dave: need to decide in Lyon what we will do in ITS

 

   yves: we should assume that XLIFF will have "mrk" extension

 

   "ed. note"

 

   <scribe> ACTION: felix to make a doodle poll for 1/2 day calls

   in November [recorded in

   [16] http://www.w3.org/2012/10/29-mlw-lt-minutes.html#action02]

 

issue-52

 

   [17] https://www.w3.org/International/multilingualweb/lt/track/issues/52

 

   [18] http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#Disambiguation-implementation

 

   [19] http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#domain-implementation

 

   yves: no discussion on algorithm since it has been posted

   ... need to make sure that test cases have the algorithm built

   in

 

   felix: could we close this and see what the test cases say?

 

   dave: ankit and leroy will be in Lyon to discuss this

 

   pedro: mauricio is working on this

   ... mauricio will try to test this before thursday

   ... in real time system we will do it too

 

   [20] http://www.w3.org/International/multilingualweb/lt/wiki/LyonNov2012#Thursday_1st_Nov:_MLW-LT_WG_meeting_agenda

 

   "session 0: "

 

   "session 4: spec review from Mārcis Pinnis, and review of

   editorial notes in spec"

 

aob

 

   felix: nothing, see or hear you soon in Lyon

 

Summary of Action Items

 

   [NEW] ACTION: daveL to check implementation status from phil

   [recorded in

   [21] http://www.w3.org/2012/10/29-mlw-lt-minutes.html#action01]

   [NEW] ACTION: felix to make a doodle poll for 1/2 day calls in

   November [recorded in

   [22] http://www.w3.org/2012/10/29-mlw-lt-minutes.html#action02]

 

   [End of minutes]

     __________________________________________________________

 

 

    Minutes formatted by David Booth's [23]scribe.perl version

    1.137 ([24]CVS log)

    $Date: 2012/10/29 16:56:18 $

 

     [23] http://dev.w3.org/cvsweb/~checkout~/2002/scribe/scribedoc.htm

     [24] http://dev.w3.org/cvsweb/2002/scribe/

 

 

 

Received on Tuesday, 30 October 2012 12:36:54 UTC