RE: Feedback on Authoring Techniques for XHTML & HTML Internationalization

>[1] ...
I agree with the two functions:
 - Metadata
 - Text

The primary language should serve as both the metadata
language and the default text language. One should try
to keep things simple for people who work with these
documents every day.

>> [2] "The text in the title must be language
neutral."
> I'm not sure why, if there's only a single language.

I agree. My statement in the document:
 - One primary language:
    + Title in this language.

 - Several primary languages:
    + Language neutral or empty.
    + In all the primary languages.

> [3] "meta element with the attribute http-equiv is
proposed because it is the only mechanism".  Although
one could say that theoretically declaring in the meta
element is equivalent to declaring in the http header
Content-Language, that is not the case in practise.

They are not the same. One has to separate:

 - Declaration of the primary language(s).

 - What the processors (e.g., servers, text
processing)  do with the declaration.

A document could have a declaration of a primary
language in http-equiv and the server could ignore it.
Indeed, this is the most common case today. 
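
A minimal sketch of the distinction, assuming a document
whose primary language is English (the markup is
illustrative and not taken from the proposal). The
declaration lives in the document; whether a server copies
it into the HTTP response is a separate matter:

```html
<head>
  <!-- Declaration of the primary language inside the document -->
  <meta http-equiv="Content-Language" content="en">
  <title>Example</title>
</head>
<!-- A server may or may not send the corresponding HTTP header:
     Content-Language: en -->
```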

> I find this statement, coupled with the following
that "servers should include the primary language(s)
in the Content-Language field" confusing.  Those are
two mechanisms.  The meta is not created
automatically.

Recommendations should indicate what the different
types of processors should do with the primary
language.

>Note also that in practise non of the user agents we
tested actually used the information in the meta
element to establish language - all of them used the
declaration in the html element, though.  A rule like
this requires all user agents to change their
behaviour if it is to be successful.

This is the reason why one should accept the declaration
in the html element.

>[4] Why should text processors consider the primary
language the default text processing language?

Because one is declaring a document to be "en", it is
reasonable to assume that the default language is
"en".
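
As an illustration only (the content is made up), a
declaration on the html element acts as the default for
any text that carries no declaration of its own:

```html
<html lang="en">
  <head><title>Example</title></head>
  <body>
    <!-- No lang attribute here: the text inherits "en" -->
    <p>This paragraph is treated as English.</p>
    <!-- An explicit declaration overrides the default -->
    <p lang="fr">Ce paragraphe est déclaré en français.</p>
  </body>
</html>
```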

> If it becomes undefined when several are declared,
this seems a poor strategy.

Your proposal has the same problem: one has to identify
the language within the text stream.

>[5] Your example of multiple language text marked up
in <title> cannot be done currently because HTML will
not allow markup in that element.  I do not see that
happening until we get to XHTML 2.0.  So this is not
workable for existing HTML/XHTML documents.  That's a
really big problem. (Note, by the way, that the
candidate for 'foo' is 'span'. That's standard
practise.)

I agree: I am identifying the problem, but I could not
suggest a solution. I assumed that one could have span
in title, but I checked (a few years back) and found
that it was not permitted.
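
For reference, a sketch of the kind of markup at issue
(the title text is invented). It is not valid in HTML 4.01
or XHTML 1.0, where title may contain only text:

```html
<!-- Invalid: title does not allow child elements -->
<title>
  <span lang="en">Welcome</span> /
  <span lang="fr">Bienvenue</span>
</title>

<!-- Valid today: one language for the whole title -->
<title lang="en">Welcome</title>
```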

>Secondary proposal:

>[6] Again, this seems to operate on the premise that
there should be only one language declaration. I do
not see any justifications for this in your proposal.

I agree with the two functions, metadata and text, but
syntactically one should make it as simple as
possible. This is the justification for a single
language declaration. Indeed, two language
declarations are not needed. As I commented above, it is
reasonable to assume that the language in the metadata
declaration is the default text language. Indeed, the
opposite would not make sense.

>[6] "It is not proposed to use the xml:lang
attribute."  There are good reasons for using both in
hybrid XHTML 1.0 documents - so you can read in user
agents as HTML, but process as XML. I do not want to
debate the merits and demerits of using XHTML served
as text/html, but it is widely done, and I do not see
this as a practical requirement.  It is irrelevant for
HTML and for XHTML 1.1+ and XML.

This is the worst offender: there is no reason to use a
double declaration. Having the lang attribute is
sufficient. By the way, nothing breaks if one has a
lang attribute in XML.
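
To make the two options concrete, a sketch of the root
element in each case, assuming an XHTML 1.0 document in
English:

```html
<!-- Double declaration, as used in hybrid XHTML 1.0 documents -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">

<!-- Single declaration, as argued for here -->
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
```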

>[7] Note that your proposal for multiple values for
the xml:lang attribute is currently not supported by
XML, and is unlikely to be supported in the near
future.  It is therefore ruled out for a large amount
of existing data.  (It's not clear from your proposal
whether you are proposing usage or changes to the XML
standard with this document.  If the latter, I don't
see any convincing arguments to change in your
document.)

It is in a "more work is needed" section; i.e., it is
an illustration of how things could develop.

I am not proposing a change to XML {it would be easier
to change the Bible -:) }. It seems that with the
existing standard one could have several values in the
attribute xml:lang. From section "2.12 Language
Identification"

 "The values of the attribute are language identifiers
..."

"values" is in the plural, and nothing in the production
rules restricts the attribute to a single value.

Neither well-formed nor valid documents would break:
the attribute xml:lang has to be declared in valid
documents.

This would have to be double-checked. But if one
considers XML a syntactic layer, nothing has to
change.
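
Purely as a hypothetical illustration of the idea (the
space-separated form and the sample text are assumptions,
and this is not a documented feature of XML or XHTML), a
multi-valued declaration might look like:

```html
<!-- Hypothetical: several language tags in one xml:lang value -->
<p xml:lang="en es">Hello / Hola</p>
```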

Regards
Tomas



Received on Thursday, 28 October 2004 20:55:28 UTC