See also: IRC log
Goutam presents his three layer scheme
CL: a comment
... we have covered the content domain level already in our requirement documents
... the second level, the sentence level
... information you suggest, like "this is a question"
... that information would ensure accurate translation
... that level can be covered by a host schema
... for example the TEI has elements which could cover the sentence level
... as for the third level, the word level, that is related to terminology work
... e.g. you might say "that term is the expression of a concept 'bank'"
<SebastianR> an example of TEI word-level markup:
CL: we addressed the terminology realm in the requirement document
<SebastianR> <u trans="smooth" who="PS1BY">
<SebastianR> <s n="25">
<SebastianR> <w type="DTQ">what</w>
<SebastianR> <w type="VBZ">'s </w>
<SebastianR> <w type="EX0">there </w>
<SebastianR> <w type="TO0">to </w>
<SebastianR> <w type="VVI">put</w>
<SebastianR> <c type="PUN">, </c>
<SebastianR> <w type="VVD">took </w>
<SebastianR> <w type="AT0">an </w>
<SebastianR> <w type="AJ0">extra </w>
<SebastianR> <w type="CRD">twenty </w>
<SebastianR> <w type="CRD">thousand </w>
<SebastianR> <w type="PRP-AVP">on </w>
<SebastianR> <w type="PRP">from </w>
<SebastianR> <w type="AT0">the </w>
<SebastianR> <w type="NN1">beginning</w>
<SebastianR> <c type="PUN">?</c>
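A minimal sketch, assuming Python's standard `xml.etree.ElementTree`, of how a tool could read the word-level `type` values from markup like the fragment pasted above. The sample is shortened, and the closing `</s>` and `</u>` tags are added here as an assumption so that it parses; `word_tags` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the pasted fragment. The closing </s> and </u> tags
# are an assumption added so that the sample is well-formed.
SAMPLE = """
<u trans="smooth" who="PS1BY">
  <s n="25">
    <w type="DTQ">what</w>
    <w type="VBZ">'s </w>
    <w type="EX0">there </w>
    <c type="PUN">?</c>
  </s>
</u>
"""


def word_tags(xml_text):
    """Return (token, type) pairs for every <w> element, in document order."""
    root = ET.fromstring(xml_text)
    return [((w.text or "").strip(), w.get("type")) for w in root.iter("w")]


pairs = word_tags(SAMPLE)
for token, tag in pairs:
    print(token, tag)
```

Punctuation sits in `<c>` elements, so it is skipped here; a tool that needs it could iterate over both element names.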
FS: the categories of Goutam's schema can also be expressed as attributes
... e.g. <s type="praying">
RI: So you want to use three attributes which should be available everywhere?
GO: No, I will explain
Goutam explains the proposed scheme
CL: We agree that these three levels of information will give us a lot of benefit
GO: You will get meaningful output
CL: I do not agree that it will solve every problem of translation
... e.g. dialogue systems and machine translation systems
... they do not just know about pos, sentence cat, domain
... but really about complex transfer conditions
... you need that information to do accurate work
... our scope is not to tell people who build parsers how to do that
... what you propose can be a part of our guidelines
... which show that you cannot do an accurate translation without such information
... so as a guideline for schema authors: please provide that information
... then people like the TEI people can see if they have covered that topic
... or people who develop new schemas will read the ITS guidelines and create their schemes in that way
FS: would anybody disagree to put this into the guidelines?
CL: I would put it into the guidelines and add a fourth level
... the guidelines should say
... please provide as much context as possible (i.e. "context" as a fourth level)
... "please don't give every sentence to a translator separately, but give the translator the context"
... e.g. the translator should see not only the content of a XUL element, but the other parts of the XUL document respectively
YS: how about dialect specification?
... would that be part of the requirement for lang / locale specification?
RI: we should mark it up for language
... a question on the purpose of the three layer scheme:
... do you expect content authors to mark up those layers?
GO: It might be, but not necessarily
RI: so this markup would be used by a linguistic person?
GO: Maybe even a "simple" person
... e.g. students who study grammar, first language / second language grammar
YS: I would not use such markup because I'm bad at grammar ...
CL: I share Yves' feelings
... some authors have difficulty providing this information
RI: Is this for use by machines?
... if that is the case, the tokens have to be machine recognizable
... it seems to be difficult for an ordinary person to use such information
CL: on RI's question whether this is for humans or for machines
... I think information about the domain, sentence type or specific words
... it will help translators to do better quality work or to do the work quickly
... if they know that a word belongs to a specific domain
... they can go to a terminology database and check the word
... so even for human translators this might be helpful
... e.g. "this is a computer interface string" is helpful information
... for my understanding, the human use scenario is not only for the translators
... but also authors or for quality assurance
RI: That is a different topic for quotation
... the example you give with term databases
... a machine, not a human, will look up the database
CL: As for terminology
... the translator has to be made aware of the fact that s.t. is a term
FS: that is then the terminology requirement
YS: I propose to have an action item to work on the document Goutam started
<scribe> ACTION: Goutam to continue work on the document he started to see if we should put that into the guidelines, including the aspect of language / dialect identification [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action01]
YS: we talked about pointing to XSLT
... I think there are different kinds of mapping
... i.e. a 1:1 mapping
... sometimes we have to map elements to attributes
... so how could we address that
... e.g. s.t. like the DITA translate would be easy to map
FS: how about the question when the mapping takes place?
... e.g. "on the fly" during processing or before / after processing?
YS: We don't have to specify that
FS: Yes, we can leave that to the people who use the mapping
YS: let's collect examples what kind of mapping we need
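A minimal sketch of the two kinds of mapping mentioned above: a 1:1 attribute mapping and an element-to-attribute mapping. All source names (`localize`, `notrans`) and the target element `span` are hypothetical and are not taken from ITS or DITA; only the target attribute name `translate` echoes the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules for an imaginary host vocabulary.
ATTRIBUTE_RENAMES = {"localize": "translate"}             # 1:1 mapping
ELEMENT_TO_ATTRIBUTE = {"notrans": ("translate", "no")}   # element -> attribute


def apply_mapping(root):
    """Rewrite the tree in place according to the two rule tables."""
    for elem in root.iter():
        # 1:1 case: same value, different attribute name.
        for old, new in ATTRIBUTE_RENAMES.items():
            if old in elem.attrib:
                elem.set(new, elem.attrib.pop(old))
        # Element case: the element name itself carried the information,
        # so it becomes a generic element plus an attribute.
        if elem.tag in ELEMENT_TO_ATTRIBUTE:
            name, value = ELEMENT_TO_ATTRIBUTE[elem.tag]
            elem.tag = "span"
            elem.set(name, value)
    return root


doc = apply_mapping(
    ET.fromstring('<p localize="yes">Press <notrans>Ctrl</notrans> now</p>')
)
print(ET.tostring(doc, encoding="unicode"))
```

Nothing in the sketch fixes when the mapping runs; it could be applied on the fly or as a separate step before processing.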
SR: If you say that the translate attribute of ITS maps to DITA translate
... that could only be a clue for a human
CL: One of our requirements is to mark up terminology
Dita localization aids: http://www-306.ibm.com/software/globalization/topics/dita/localization.jsp
CL: I saw simple mappings of proprietary language identifiers to official ones
... e.g. people would use numeric values to identify languages
RI: so you map values, right?
... we need a list of data categories what is to be mapped
... e.g. translatability, constraints
... and then a list of the mappings
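A minimal sketch of such a value mapping, organised as a list of data categories with one mapping table each. The category names, the numeric identifiers and the flag values are all invented for illustration; only the idea of mapping proprietary values to official ones comes from the discussion.

```python
# Hypothetical mapping tables, keyed by data category.
MAPPINGS = {
    # numeric language identifiers of an imaginary proprietary format
    "language": {1: "en", 2: "de", 3: "ja"},
    # imaginary translatability flags
    "translatability": {"T": "yes", "N": "no"},
}


def map_value(category, proprietary_value):
    """Return the official value for a proprietary one, or raise ValueError."""
    try:
        return MAPPINGS[category][proprietary_value]
    except KeyError:
        raise ValueError(
            f"no mapping for {proprietary_value!r} in category {category!r}"
        ) from None


print(map_value("language", 2))
print(map_value("translatability", "N"))
```

Raising on an unknown value, rather than passing it through, keeps unmapped proprietary identifiers from leaking into the output unnoticed.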
YS: we don't want to force people to use ITS if they already have the information
RI: But that is a different use case, right?
... if DITA does exactly the same thing, what do we have to do?
SR: DITA attributes are not in a namespace
... so it would be no problem if we use that
... in DTDs, you could hard wire prefixes
RI: hard wired means "changing the schema"?
YS: if we are stepping out of the namespace realm
... we might run into clashes
SR: by proposing the automatic mapping
... that is a burden for the processing application
YS: true, but the tools can be very generic
FS: would it be a possibility to approach the DITA people and make an agreement with them on what one should use for "translate"?
YS: it would be mainly the case in terminology
FS: example with architectural forms: http://www.w3.org/People/fsasaki/EML2005sasa0411.html section 4.3.1
CL: If we establish what the indicator of translatability is
... that would be very helpful
... the "equiv" would be helpful for people who are in the process and the people who use this
... of course we might have problems which RI and SR mentioned
... so we could provide a container for mapping
... and have suggestions how to fill the container
... e.g. with xslt
FS: so an "extensible" container?
RI: that sounds like localization property stuff
YS: to some degree
... the problem is: we have schemas
... which we cannot process
because there is no generic way of applying their l10n related information
scribe: to the ITS sensitive tools
RI: somebody has to do that at some point
... and the schema is the best place to have that information
RI: Another issue
... if you put that into our schema, the dita schema might change
<YvesS> ..FS to show the example with Architectural forms.
RI: Would that not be localization properties work?
YS: In some way
... if there is a W3C way of mapping
... we could just adopt it
... I want to say
"img" is a graphic
scribe: as the tool processes "img"
... it should be processed like a graphic
RI: That is not a tag set again, that is localization properties
YS: yes, but we have that existing requirement
... we thought we have a common goal, but maybe not
CL: We will solve the need to provide information about correspondences
... we recommend that to people and have an element / attribute that points to a mapping
... then we say that people can consider different things like xslt or architectural forms
YS: That is one part
... in addition we need to look at the type of mapping we need
... I want to know what will be mapped
... I want a pointer to xslt
... and s.t. that says "what is mapped"
here some xslt
RI: If you have xslt stylesheets you would tie that to a specific version of DITA
CL: should that be another section of the WD?
... we have discussed this far enough
... we should be able to make a statement about the mapping
... we should not prescribe how the mapping is realized
YS: just a place holder that the mapping exists
<scribe> ACTION: CL and FS to decide who will edit the mapping section of the ITS implementation WD [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action02]
YS: we have three documents
requirements, ITS guidelines, ITS specification
YS: no editor for the guidelines yet
... AZ mentioned he would do some editing work
... I will do some editing as well of the ITS guidelines
<scribe> ACTION: YS as the initial editor for the ITS guidelines, Diane helping [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action03]
RI: please before you do editing, please read:
scribe: and follow the guidelines
FS: don't spend much time on the status section
RI: that is important only before publication
YS: In september, we would like to publish the first WD of the ITS specification
... the second publication of the req. document is november
... the first publication of the "ITS techniques" (before called "ITS guidelines")
... so we don't have to change our deadlines now
SR: You would need to write an "ODD2XMLSPEC.xsl"
xmlspec i18n dtd from http://www.w3.org/International/xmlspec/002/xmlspec-i18n.dtd
i18n specific elements: http://www.w3.org/International/xmlspec/002/i18n-elements.mod
YS: let's continue that discussion by email
FS: proposal to have a focus on the ITS specification and ITS techniques
... the req document should only be updated from time to time
YS: how about the wiki editing?
... how do we keep track of the changes if we publish a new WD?
... do we have to change everything in the wiki?
... in the document with div, del, ins?
... that takes a lot of time
CL: does everybody need to modify the req documents in the wiki?
... maybe we could say we move away from the wiki
RI: If you have a contentious subject
... there is a lot of mail discussion
... it is difficult to summarize discussions
... as for the wiki, you can see what is being talked about
FS: how to handle the ITS techniques and the ITS specification?
... also handling in the wiki? i.e. converting ODD (possibly ODD) into the wiki
YS: that is a general problem for all three documents
RI: what would you do with an image?
bugzilla example: http://www.w3.org/Bugs/Public/show_bug.cgi?id=1334
<SebastianR> Christian/Felix: grab http://users.ox.ac.uk/~rahtz/its.zip and see the Makefile
<SebastianR> (that is the ODD demo to see if you can reproduce)
<YvesS> YS: we will discuss requirements
<YvesS> .. and does anybody have other requests?
<YvesS> ACTION: For YS to post message about meeting f2f Dec-14 to 16 (noon). [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action04]
1) should the req be in the techniques doc / in the specification doc?
2) is the req sensitive to the scope problem we discussed at the f2f?
http://esw.w3.org/topic/its0506ReqConstraints in spec, sensitive to scope
http://esw.w3.org/topic/its0503ReqSpan in spec, not sensitive to scope
http://esw.w3.org/topic/its0503ReqEntities part of techniques doc
http://esw.w3.org/topic/its0503ReqLangLocale part of the techniques document
http://esw.w3.org/topic/its0503ReqTermIdentification probably techniques doc, depends on how we develop it
http://esw.w3.org/topic/its0504ReqPurposeSpecMap we don't know yet
<scribe> ACTION: felix to ask w3c if there is a methodology for mapping existing / under development [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action05]
http://esw.w3.org/topic/its0908LinguisticMarkup we don't know
http://esw.w3.org/topic/its0504ReqCulturalAspects maybe a technique, but we don't know yet
YS: part of the techniques, with a "?"
... good practice would be to provide an attribute to give feedback to the translator
SR: like an alt tag on a link which is specific for its
FS: so part of the specification?
example for such a link:
<para> If you create a typing error like "strs(s)",
you will get the message
discussion about the linked text requirement
RI: this is one driver of the original ITS work
... originally we said to SSML folks that they need bidi markup for accessibility
... they asked us for a coherent way of doing that
... so we started this effort: ITS (initially)
... it would be nice to have this as part of the xml ns, but that is not likely to happen
YS: so this is part of the spec and the techniques doc
FS: And this is not part of the scope issue
http://esw.w3.org/topic/its0505Translatability part of the spec and the techniques
YS: and we do need scope
SR: things like bidi are more part of i18n, most of the other stuff we talked about is part of l10n
YS: this would be a guideline / technique
... SR said that we need to make the difference between universal things (like bidi) and l10n specific things
RI: for some things we might say "please use these tags" ..
... there might be s.t. like "please don't do this" like translatable text in attributes
... and the third category would be "here is s.t. you could use"
YS: like the ITS tag set?
... and we would make clear what aspect would be important
YS: back to metrics: what should it be?
... metrics does not enhance the localizer, I think
YS: this is a guideline
SR: a guideline of good practice and an instruction
YS: please avoid s.t. like: <Message001>Cannot open the file.</Message001>
... more and more the names are the same as the content
... or s.t. generic because they use non-xml tools for the generation of xml
RI: they should use IDs for ids, and not the name of the element
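A minimal sketch, assuming element names follow a "word plus digits" pattern such as `Message001`, of how such markup could be turned into a generic element that carries the identifier in an attribute. The pattern, the lower-cased target element name and the attribute name `id` are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Assumed naming pattern: letters followed by digits, e.g. "Message001".
NUMBERED_NAME = re.compile(r"^([A-Za-z]+?)(\d+)$")


def use_ids(root):
    """Move the identifier out of the element name into an id attribute."""
    for elem in root.iter():
        match = NUMBERED_NAME.match(elem.tag)
        if match:
            elem.set("id", elem.tag)           # keep the old name as the id
            elem.tag = match.group(1).lower()  # generic element name
    return root


doc = use_ids(
    ET.fromstring(
        "<messages><Message001>Cannot open the file.</Message001></messages>"
    )
)
print(ET.tostring(doc, encoding="unicode"))
```

With one generic element name, a tool needs a single rule for all messages instead of one rule per element name.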
this is guidelines
SR: this is like its:info
... you might want to say "who said that"?
YS: so that means specification, and it has to do with scoping
YS: explains the req
from the xml rec:
The value "default" signals that applications' default white-space processing modes are acceptable for this element; the value "preserve" indicates the intent that applications preserve all the white space.
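A minimal sketch of how a processing tool might honour the two values quoted above. Collapsing runs of white space is used here as one possible "default" mode; that choice, and the helper name `normalize`, are assumptions. For brevity the sketch only touches the text directly inside an element and leaves text that follows child elements alone.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def normalize(elem, preserve=False):
    """Collapse white space in element text unless "preserve" is in effect."""
    setting = elem.get(XML_SPACE)
    if setting == "preserve":
        preserve = True
    elif setting == "default":
        preserve = False
    # Without the attribute, the value in effect on the parent is kept.
    if not preserve and elem.text:
        elem.text = " ".join(elem.text.split())
    for child in elem:
        normalize(child, preserve)
    return elem


doc = normalize(
    ET.fromstring(
        '<doc><p>  a   b </p><pre xml:space="preserve">  a   b </pre></doc>'
    )
)
print(repr(doc[0].text), repr(doc[1].text))
```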
YS: so this would be a guideline
YS: it is an issue for the localization process
... and a guideline
SR: It depends on how you manage the process
YS: part of the specification
RI: Steve wants to have a different ruby spec
... which is not so presentation oriented
... I want to have a different level of conformance
<YvesS> .. three levels would be better
<YvesS> RI: wonder if we should separate attribute and element in scoping (even for translatablity).
RI: we don't want to provide a tag set for bad practice
... but we can show them how to get out of trouble
YS: We don't have a solution for attributes, so we can only have the element content case in the spec
SR: what is the value of knowing it is a date?
... you can just use the data type "date"
... is it different than marking up technical terms as terms
RI: it gives you the date itself
... i.e. a machine could transform it into a specific calendar etc.
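A minimal sketch of the kind of machine transformation meant here, assuming the marked-up value is an ISO 8601 date string. The per-locale patterns are illustrative only, and conversion to another calendar system is not attempted.

```python
from datetime import date

# Illustrative display patterns; a real tool would take them from locale data.
PATTERNS = {
    "en-US": "%m/%d/%Y",
    "de-DE": "%d.%m.%Y",
    "ja-JP": "%Y/%m/%d",
}


def render(iso_value, locale_tag):
    """Format an ISO date string (YYYY-MM-DD) for the given locale tag."""
    return date.fromisoformat(iso_value).strftime(PATTERNS[locale_tag])


print(render("2005-09-21", "de-DE"))
print(render("2005-09-21", "en-US"))
```

This only works because the markup carries the date value itself; a plain "this is a date" flag on free text would still leave the parsing problem to the tool.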
YS: I put that as a guideline, and we see what will happen
<scribe> ACTION: Sebastian to introduce to the wg the l10n / i18n aspects of the TEI [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action06]
YS: goes to the guidelines
<scribe> ACTION: SR to put a comment on http://esw.w3.org/topic/its0509ReqNestedElements in the wiki [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action07]
<scribe> ACTION: Felix to make proposals by mail for a shortcut for the namespace of the ITS spec wd [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action08]
action items for monday: http://www.w3.org/2005/09/19-i18n-minutes.html#ActionSummary
<scribe> ACTION: to contact Deborah A. Lapeyre (DITA committee) about the relation between its / DITA [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action09]
action item for tuesday: http://www.w3.org/2005/09/20-i18nts-minutes.html#ActionSummary
<scribe> ACTION: RI to check for hosting the f2f near Oxford (December, 14-16 (noon)) [recorded in http://www.w3.org/2005/09/21-i18nts-minutes.html#action10]
YS: drop http://www.w3.org/2005/09/19-i18n-minutes.html#action04 and
... these are not necessary anymore
GO: a different topic: computational or "semantic" linguistic markup
YS: Thanks to everybody
GO: Thanks to you all
... I was happy to be able to come