This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 2877 - Terminology data category
Summary: Terminology data category
Status: CLOSED DUPLICATE of bug 2969
Alias: None
Product: ITS
Classification: Unclassified
Component: ITS tagset
Version: WorkingDraft
Hardware: PC Windows XP
Importance: P2 normal
Target Milestone: LastCall20May
Assignee: Felix Sasaki
QA Contact: ITS mailing-list
Reported: 2006-02-15 08:24 UTC by Felix Sasaki
Modified: 2006-07-21 17:49 UTC

Description Felix Sasaki 2006-02-15 08:24:02 UTC
TBD until last call.
Comment 1 Christian Lieske 2006-02-17 11:11:16 UTC
While working on data category "terminology" I stumbled across some questions, 
which may pop up elsewhere as well:

1. What if the host vocabulary already has markup related to terms (see for 
example DITA and DocBook)? Do we recommend keeping it and mapping it via a 
documentRule? If so: Can this recommendation be generalized, and thus for 
example become part of the introduction to data categories?

2. What if the host vocabulary and our ITS markup related to terms only share 
some commonalities? Example: The DITA "term" element allows more than just one 
attribute with additional information. Do we suggest to

a. move stuff from ITS into host vocabulary
   
  <dita:term its:dir="ltr">PlateBroiler</dita:term>

b. move stuff from host vocabulary into ITS

  <its:term dita:platform="CoolOS">PlateBroiler</its:term>

Or do we suggest something completely different?

3. What if we have a clash of the information from the namespace of the host 
vocabulary and the ITS namespace? Example

<head>
  <documentRule its:term="yes" its:termSelector="//dita:term"/>
</head>
<body>
  <p>The highly visible <dita:term dita:translate="no">PlateBroiler</dita:term> ...
</body>

4. What if the host vocabulary and ITS differ with regard to one of the 
following:

4.1 content model (for example PCDATA vs. mixed)
4.2 data type (for example NMTOKEN vs. CDATA)

In addition, I stumbled across some things which may only be relevant for the 
term data category

5. The "termRef" is a URI which consists of a termbase identifier prefix and a 
term identifier suffix. Example:

<its:documentRules>
 <its:documentRule its:term="yes" its:termSelector="/body/p[1]/span"
its:termRef="http://example.com/termdatabase/#x142539"/>
</its:documentRules>

I wonder if there is a need to "factor out" the termbase identifier, since it 
will be the same for possibly dozens of terms. Example:

<its:documentRules termBaseRef="http://example.com/termdatabase/#">
 <its:documentRule its:term="yes" its:termSelector="/body/p[1]/span"
its:termRef="x142539"/>
</its:documentRules>

6. I wonder if we need a recommendation related to Yomigana (phonetic strings; 
see http://esw.w3.org/topic/its0503ReqTermIdentification). We currently have 
not foreseen this as part of the term data category. I could imagine a 
recommendation like 'Use "termRef" and put the Yomigana into your termbase'.




Comment 2 Yves Savourel 2006-03-02 20:53:45 UTC
Regarding 2:

I would think that in general, the markup of the host language should (and I 
am tempted to say must) be used if it offers ITS-equivalent semantics.


Regarding 5:

I would think separating termRef into termRef and termBaseRef may make things 
a bit complicated from an implementation viewpoint. It would be OK in a 
documentRule/s, but locally, knowing which termBase to use may be more 
difficult.

Comment 3 Felix Sasaki 2006-03-03 00:08:02 UTC
I'm trying to answer the questions using the proposals from the ITS f2f in Mandelieu.

(In reply to comment #1)
> While working on data category "terminology" I stumbled across some questions, 
> which may pop up elsewhere as well:
> 
> 1. What if the host vocabulary already has markup related to terms (see for 
> example DITA and DocBook)? Do we recommend keeping it and mapping it via a 
> documentRule? If so: Can this recommendation be generalized, and thus for 
> example become part of the introduction to data categories?

I agree with Yves. One addition: I would say

<its:termRule its:select="//qterm"/>
selects the <qterm> element and says "this is a term in the semantics of ITS".

<its:termRule its:select="//qterm"
its:termRef="http://www.example.com/termbase/#entry2332"/>
additionally does "adding", that is, it adds the term reference.

<its:termRule its:select="//qterm" its:termRefMap="@someTermRef"/>
would, instead of "adding", be a "pass through" of term reference information.
Maybe the name @termRefContent would be more appropriate? ;)
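
A minimal sketch of how a processor might interpret the three termRule variants above. It is only an illustration under assumptions: the function apply_term_rule, its parameters, and the sample document are hypothetical, and ElementTree's limited XPath needs ".//qterm" where the rules say "//qterm".

```python
import xml.etree.ElementTree as ET

# Hypothetical host document; only the first <qterm> carries its own reference.
DOC = """<doc>
  <p>The <qterm someTermRef="http://www.example.com/termbase/#entry17">PlateBroiler</qterm> is hot.</p>
  <p>Use the <qterm>GrillPan</qterm> instead.</p>
</doc>"""


def apply_term_rule(root, select, term_ref=None, term_ref_map=None):
    """Return (term text, term reference) pairs for every selected element.

    term_ref     -- fixed reference "added" to every match (like its:termRef)
    term_ref_map -- host attribute whose value is "passed through"
                    (like its:termRefMap); a leading "@" is ignored
    """
    results = []
    for elem in root.findall(select):
        ref = term_ref
        if term_ref_map is not None:
            # Fall back to the fixed reference when the attribute is absent.
            ref = elem.get(term_ref_map.lstrip("@"), ref)
        results.append((elem.text, ref))
    return results


root = ET.fromstring(DOC)
plain = apply_term_rule(root, ".//qterm")
added = apply_term_rule(
    root, ".//qterm", term_ref="http://www.example.com/termbase/#entry2332"
)
mapped = apply_term_rule(root, ".//qterm", term_ref_map="@someTermRef")
```

In this sketch, "adding" gives every selected element the same reference, while the "pass through" variant only yields a reference where the host markup already carries one.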

> 
> 2. What if the host vocabulary and our ITS markup related to terms only share 
> some commonalities?

I'd say we can select everything whose compositional semantics is less than or
equal to that of ITS. As for the terminology data category, our semantics has the
parts "this is a term" and "this is a term reference". Everything in an existing
vocabulary that can be selected by these semantic components IMO should be selected.

> Example: The DITA "term" element allows more than just one 
> attribute with additional information. Do we suggest to
> 
> a. move stuff from ITS into host vocabulary
>    
>   <dita:term its:dir="ltr">PlateBroiler</dita:term>
> 
> b. move stuff from host vocabulary into ITS
> 
>   <its:term dita:platform="CoolOS">PlateBroiler</its:term>
> 
> Or do we suggest something completely different?

You could do <its:termRule its:select="//dita:term"/>
but I would not know what to do about the @dita:platform attribute.

> 
> 3. What if we have a clash of the information from the namespace of the host 
> vocabulary and the ITS namespace? Example
> 
> <head>
>   <documentRule its:term="yes" its:termSelector="//dita:term"/>
> </head>
> <body>
>   <p>The highly visible <dita:term dita:translate="no">PlateBroiler</dita:term> ...
> </body>

You have two tasks: identifying <dita:term> as a term in the sense of ITS, and
marking the content of this element as not being translatable. I would keep the
tasks separate, so have
<its:termRule its:select="//dita:term"/> and
<its:translateRule its:select="//dita:term" its:translate="no"/>
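
A rough sketch of what keeping the two tasks separate could look like in a processor: each rule is applied on its own and only contributes its own annotation. The helper apply_rules, the rule tuples, and the namespace URI bound to the "dita" prefix are assumptions made for this illustration; ElementTree needs ".//dita:term" where the rules say "//dita:term".

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI for the "dita" prefix (assumed for this sketch).
NS = {"dita": "http://example.com/ns/dita"}

DOC = """<body xmlns:dita="http://example.com/ns/dita">
  <p>The highly visible <dita:term>PlateBroiler</dita:term> is new.</p>
</body>"""

# One entry per rule: (selector, data category, value).
RULES = [
    (".//dita:term", "term", "yes"),      # stands in for the termRule
    (".//dita:term", "translate", "no"),  # stands in for the translateRule
]


def apply_rules(root, rules, namespaces):
    """Apply each rule independently and collect annotations per element."""
    annotations = {}
    for path, key, value in rules:
        for elem in root.findall(path, namespaces):
            annotations.setdefault(elem, {})[key] = value
    return annotations


root = ET.fromstring(DOC)
result = apply_rules(root, RULES, NS)
summary = [(elem.text, notes) for elem, notes in result.items()]
```

Dropping either entry from RULES leaves the other annotation untouched, which is the point of not merging the two tasks into one rule.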

> 
> 4. What if the host vocabulary and ITS differ with regard to one of the 
> following:
> 
> 4.1 content model (for example PCDATA vs. mixed)
> 4.2 data type (for example NMTOKEN vs. CDATA)

Same as above: we can select everything for a data category whose compositional
semantics is less than or equal to that of ITS. More fine-grained information
about content models or data types will be lost.

> 
> In addition, I stumbled across some things which may only be relevant for the 
> term data category
> 
> 5. The "termRef" is a URI which consists of a termbase identifier prefix and a 
> term identifier suffix. Example:
> 
> <its:documentRules>
>  <its:documentRule its:term="yes" its:termSelector="/body/p[1]/span"
> its:termRef="http://example.com/termdatabase/#x142539"/>
> </its:documentRules>
> 
> I wonder if there is a need to "factor out" the termbase identifier, since it 
> will be the same for possibly dozens of terms. Example:
> 
> <its:documentRules termBaseRef="http://example.com/termdatabase/#">
>  <its:documentRule its:term="yes" its:termSelector="/body/p[1]/span"
> its:termRef="x142539"/>
> </its:documentRules>

I would not factor it out, since people might as well point to a place in the
current document.
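
To illustrate the concern, here is a small sketch of the naive "prefix plus suffix" resolution that a factored-out termBaseRef would imply. The function resolve_term_ref and the same-document reference "#glossary-entry" are hypothetical and only serve this example.

```python
def resolve_term_ref(term_ref, term_base_ref=None):
    """Naive resolution: prepend the factored-out termbase prefix, if any."""
    if term_base_ref is None:
        return term_ref
    return term_base_ref + term_ref


BASE = "http://example.com/termdatabase/#"

# Full URI, no factoring out.
full = resolve_term_ref("http://example.com/termdatabase/#x142539")

# Factored-out prefix plus term identifier suffix.
factored = resolve_term_ref("x142539", BASE)

# A pointer into the current document gets the prefix as well.
local = resolve_term_ref("#glossary-entry", BASE)
```

With the prefix in effect, the same-document pointer turns into "http://example.com/termdatabase/##glossary-entry", so a processor would need an extra way to tell termbase suffixes from local references.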

> 
> 6. I wonder if we need a recommendation related to Yomigana (phonetic strings; 
> see http://esw.w3.org/topic/its0503ReqTermIdentification). We currently have 
> not foreseen this as part of the term data category. I could imagine a 
> recommendation like 'Use "termRef" and put the Yomigana into your termbase'.

I don't think we need that.
Comment 4 Felix Sasaki 2006-04-20 08:32:27 UTC

*** This bug has been marked as a duplicate of 2969 ***