This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 3589 - Definitions of "schema document" draft proposal for bugs 2822 and 2846 PSVI and processor profiles
Status: CLOSED FIXED
Alias: None
Product: XML Schema
Classification: Unclassified
Component: Structures: XSD Part 1
Version: 1.1 only
Hardware: PC Windows XP
Importance: P4 normal
Target Milestone: ---
Assignee: C. M. Sperberg-McQueen
QA Contact: XML Schema comments list
URL:
Whiteboard: terminology cluster
Keywords: editorial, noFurtherAction
Depends on:
Blocks:
 
Reported: 2006-08-08 22:21 UTC by Noah Mendelsohn
Modified: 2009-10-15 19:29 UTC

Description Noah Mendelsohn 2006-08-08 22:21:10 UTC
Definitions of "schema document" draft proposal for bugs 2822 and 2846 PSVI and processor profiles

This is to get into bugzilla a comment I made on the Telcon of 4 Aug 2006.   The draft at [1] says:

"It is implementation-defined whether a schema processor can read schema documents in the XML transfer syntax defined here, or in the form of information sets which correspond to the XML syntax. (See Conformance (§2.4), which defines "·minimally conforming·" processors as those which cannot read schema documents in XML form, and "·schema-document aware·" processors as those which can.)"

My main concern is specifically with the text "schema documents in the XML transfer syntax defined here", which raises the question of what it means for something to be "defined" in our recommendation.  My strong preference is that we reserve the term "defined" for things which are marked up as: "[Definition:] XXXX".  In the particular case of the term -schema document- we have:

"To provide for this in an appropriate and interoperable way, this specification provides a normative XML representation for schemas which makes provision for every kind of schema component. [Definition:]  A document in this form (i.e. a <schema> element information item) is a schema document. "

I think it's pretty clear that what we define as a schema document is an "element information item", and hence an abstract Infoset.  Taking that narrow view of what it means for something to be defined in our recommendation, I don't think we define an XML Transfer Syntax for XML Schema Documents.

I think the closest we come is in [3], where we say:

"For interoperability, serialized ·schema documents·, like all other Web resources, should be identified by URI and retrieved using the standard mechanisms of the Web (e.g. http, https, etc.) Such documents on the Web must be part of XML documents (see clause 1.1), and are represented in the standard XML schema definition form described by layer 2 (that is as <schema> element information items).

Note:  there will often be times when a schema document will be a complete XML document whose document element is <schema>. There will be other occasions in which <schema> items will be contained in other documents, perhaps referenced using fragment and/or XPointer notation. "

Here I think we're referring to serializations,  but not >defining< anything.  

For the same reason, I'm concerned about the phrase that says:  "or in the form of information sets which correspond to the XML syntax".  That comes close to implying that we only define the serialization, but by the way there is a corresponding Infoset.  For the reasons quoted above, I think the reverse is true.  The formal definition of schema document is as an infoset, and by the way there is for each such Infoset a class of schema documents that correspond (differing, e.g. in whether their attributes use single quotes, the order of attribute serialization, etc.)
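
As a rough illustration of the point about one Infoset corresponding to a class of serializations, the following Python sketch parses two serializations that differ only in quoting, attribute order and whitespace, and compares what the parser reports.  The two sample documents, the "urn:example" namespace and the summarize helper are hypothetical, and xml.etree.ElementTree only approximates the Infoset (it discards namespace prefixes, for example), so this is a sketch under those assumptions rather than a test of Infoset equality.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Two hypothetical serializations of the same <schema> element. They differ
# only in quote style, attribute order, and whitespace inside the tag.
doc_a = f'<xs:schema xmlns:xs="{XS}" targetNamespace="urn:example" elementFormDefault="qualified"/>'
doc_b = f"<xs:schema   elementFormDefault='qualified' targetNamespace='urn:example' xmlns:xs='{XS}' />"


def summarize(elem):
    """Reduce an element to (name, attributes, children), ignoring lexical detail."""
    return (elem.tag, dict(elem.attrib), [summarize(child) for child in elem])


# The character strings differ, but the parsed structure is the same.
same = summarize(ET.fromstring(doc_a)) == summarize(ET.fromstring(doc_b))
print(doc_a == doc_b, same)
```

The helper deliberately ignores text content and prefixes; a fuller comparison would need to decide which information items and properties matter.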

To be clear, I don't object to the spirit of what I think [1] is trying to say, just to the exact way it's stated.  In the spirit of offering concrete alternatives when there's a concern, I can think of at least two that would be fine with me, and I'm sure there are many other simple fixes that would be fine too:

Alternative 1:

"The exact form in which XML Schema documents are conveyed to a schema processor is implementation dependent.  In particular, they MAY be read, either from the Web or from other sources, in the form of XML 1.x serializations, and/or they MAY be conveyed through other means.   (See Conformance (§2.4), which defines "·minimally conforming·" processors as those which cannot read schema documents in XML form, and "·schema-document aware·" processors as those which can.)"

Alternative 2:

(add a definition and use it)

[DEFINITION:] A -serialized XML Schema Document- is an XML 1.x document corresponding to an XML -schema document- infoset. 

Then we can use something closer to the original text:

"It is implementation-defined whether a schema processor accepts schema information in the form of -serialized XML schema documents- and/or in some other form that conveys the -schema document- Infoset. (See Conformance (§2.4), which defines "·minimally conforming·" processors as those which cannot read schema documents in XML form, and "·schema-document aware·" processors as those which can.)"

A couple of other nits: I think references to schema document should hyperlink to the definition.  Also, some reference to 4.3.1 might also be helpful, though I'm less sure about that.  Thanks!

Noah

[1] http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#infoset
[2] http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#key-schemaDoc
[3] http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.rq144.200607.html#schema-repr
Comment 1 C. M. Sperberg-McQueen 2006-09-06 00:42:55 UTC
For the record, the references given in the description are to
member-only documents, but the same text is visible in the working
draft of 31 August 2006:
http://www.w3.org/TR/xmlschema11-1/#impl-def-list

Note that the careful distinction Noah is trying to make between XML
documents and infosets is rather undercut by other portions of our
spec, which frequently refer to XML when on Noah's analysis it is of
deep importance that what they really mean is not XML but the infoset
in some XML or non-XML form.  The conformance clause, for example,
defines schema-document aware processors as those which "accept
schemas represented in the form of XML documents" as described in the
spec.  (The 1.0 version of the conformance section was even worse,
since the name it used for schema-document aware processors was
"processors conformant to the XML Representation of Schemas".)

So I observe that the confusion Noah is trying to combat, if indeed it
is a confusion, has deep roots in the existing text.

I have no particular attachment to the wording in the current working
draft, but I do have an attachment to saying that it's
implementation-defined whether an implementation can read schema
documents in XML form, and implementation-defined whether it can read
them in some other form which corresponds to an infoset.

That means I'd prefer not to replace "implementation-defined" with
"implementation-dependent", which means something different where I
come from.  In the QT specs, for example, implementations must
document their behavior for implementation-defined things, but not for
implementation-dependent things; I think it's important that this
property be documented in any claim of conformance, so I want
'defined' not 'dependent'.

And most practical implementations do not actually determine how
schema documents are conveyed to them.  The user does that at
invocation time, by passing schema representations or names of
schema documents in as parameters, or by setting or clearing 
flags which govern the search for components.  
All the implementation does is determine what
possibilities are supported.  So the wording "The exact form in which
XML Schema documents are conveyed to a schema processor is
implementation dependent" seems to me to state a falsehood.

On the general question: I agree with Noah that he tends to see the
infoset as primary and the XML as secondary, a serialization of that
infoset, while I tend to see the XML as primary and the infoset as
secondary, a description of (some of) the information in that XML
document.  I think the infoset spec similarly views the XML as central
and the infoset as secondary, but whether that is so or not, I find the
infoset-centric phrasing proposed in the description of this bug
decidedly confusing.

I think Alternative 2 is promising, though I would prefer that the
definition be the other way round.  So let me propose Alternative 3:

    [DEFINITION:] An -XML schema document- is an XML document or
    element whose information set is a schema document, as defined in
    this specification.

and then

    It is implementation-defined whether a schema processor can read
    XML schema documents, or schema documents in non-XML form.  (See
    Conformance (§2.4), which defines "·minimally conforming·"
    processors as those which cannot read schema documents in XML
    form, and "·schema-document aware·" processors as those which
    can.)

If the WG prefers to avoid making capitalization alone bear the
responsibility for distinguishing important terms, I will be happy (a)
to substitute another phrase ('schema document in XML form' would work
for me, although it's ugly), and (b) to adopt a name for our language
other than "XML Schema", which is a proper noun distinct from a common
noun phrase only by virtue of the capital S ("XSD" works for me, with
or without an official expansion into something like "XML Schema
Definition language").
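
As an informal illustration of "an XML document or element whose information set is a schema document", the Python sketch below looks for <schema> elements both as the document element and embedded in another document.  The sample documents, the "urn:example:*" namespaces and the find_schema_elements helper are hypothetical, and the check is only on the element name, not on validity against the schema for schema documents.

```python
import xml.etree.ElementTree as ET

XS_NS = "http://www.w3.org/2001/XMLSchema"
SCHEMA_TAG = f"{{{XS_NS}}}schema"  # ElementTree's "{namespace}local" form


def find_schema_elements(xml_text):
    """Return every <schema> element in the XSD namespace, root included."""
    root = ET.fromstring(xml_text)
    # Element.iter() yields the root itself first, then its descendants.
    return list(root.iter(SCHEMA_TAG))


# Case 1: a complete XML document whose document element is <schema>.
standalone = f'<xs:schema xmlns:xs="{XS_NS}" targetNamespace="urn:example:a"/>'

# Case 2: a <schema> element contained in some other (hypothetical) document.
embedded = (
    '<container xmlns="urn:example:wrapper">'
    f'<xs:schema xmlns:xs="{XS_NS}" targetNamespace="urn:example:b"/>'
    '</container>'
)

for label, text in (("standalone", standalone), ("embedded", embedded)):
    found = find_schema_elements(text)
    print(label, [s.get("targetNamespace") for s in found])
```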
Comment 2 Noah Mendelsohn 2006-09-06 01:13:36 UTC
Michael Sperberg-McQueen wrote:

> but I do have an attachment to saying that it's
> implementation-defined whether an implementation can read schema
> documents in XML form, and implementation-defined whether it can read
> them in some other form which corresponds to an infoset.

With the caveat that I haven't recently researched all the related text in the spec, I don't have any problem with what Michael is proposing above.  I have an "attachment" to the spirit of the existing text which says:

"The principal purpose of XML Schema: Structures is to define a set of schema components that constrain the contents of instances and augment the information sets thereof. Although no external representation of schemas is required for this purpose, such representations will obviously be widely used. To provide for this in an appropriate and interoperable way, this specification provides a normative XML representation for schemas which makes provision for every kind of schema component. [Definition:]  A document in this form (i.e. a <schema> element information item) is a schema document. "

So, a "Schema document" is by definition an information item. It's also very clear from the above that when we refer to the "XML Representation" we are using that as shorthand for an "XML Infoset conveying the Representation."  If anything else in the existing text suggests otherwise, I think it should be changed to match these definitions.

The text that Michael advocates above is basically saying:  "it's implementation defined as to exactly what concrete representation is used to convey the infoset" for such schema documents.  I think that's right, and I am reasonably happy with his proposed phrasing.
Comment 3 C. M. Sperberg-McQueen 2006-09-06 02:11:12 UTC
Note that the wording cited in comment #2 makes a hash of the claim
that the XML Schema spec does not define an XML transfer syntax, which
was the initial premise of the issue.  If the XML representation is 
normative, then clearly the spec defines an XML transfer syntax, not 
solely an infoset which may be transferred in any way found convenient.  
Since interoperability is not served by infosets (which are abstractions
independent of any testable representation like APIs or data formats), the 
sentence quoted cannot reasonably be taken to be describing the *infoset*.
The sentence clearly says that the XML representation is normative.

As for Noah's attachment to the idea that the infoset is the key
thing, not the XML form, it's a touching thought, but hardly relevant
to this issue, since nothing said so far has entailed any suggestion
to the contrary.  

As for replacing the phrase "XML representation" with the phrase
"XML Infoset conveying the Representation", I do not think it's an
editorial improvement and don't believe it would make our spec easier
to read, understand, or reason about.
Comment 4 Noah Mendelsohn 2006-09-12 18:15:27 UTC
I (Noah Mendelsohn) wrote:

> I don't have any problem with what 
> Michael is proposing above.

[...]

> It's also very clear from the 
> above that when we refer to the
> "XML Representation" we are using
> that as shorthand for an "XML
> Infoset conveying the
> Representation."


Michael Sperberg-McQueen wrote:

> As for replacing the phrase "XML 
> representation" with the phrase
> "XML Infoset conveying the Representation",
> I do not think it's an editorial improvement
> and don't believe it would make our spec easier
> to read, understand, or reason about.

I agree, and I don't believe I proposed such a change.  As noted above, I am (and have been) content with the proposed resolution of this issue.
Comment 5 C. M. Sperberg-McQueen 2008-02-04 16:17:26 UTC
In an effort to make better use of Bugzilla, we are going to use the
'severity' field to classify issues by perceived difficulty.  This 
bug is getting severity=minor to reflect the existing whiteboard note
'easy'. 
Comment 6 C. M. Sperberg-McQueen 2008-02-05 02:31:50 UTC
A wording proposal for this issue (among others) was sent to the XML
Schema WG on 4 February 2008.

http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.consent.200802.html (member-only link)

For some issues, the proposal is effectively to make no change;
see the Status section of the proposal for the specifics.
Comment 7 Noah Mendelsohn 2008-02-05 15:24:21 UTC
Commenting on the proposal at [1], which includes:

> 2. It is implementation-defined whether a schema
> processor can read schema documents in the form of
> XML documents. (See Conformance (§2.4), for
> distinction between "minimally conforming"
> processors and "*schema-document aware*" processors.


The proposed text isn't quite working for me for a number of reasons.  First of all, I think we agree that the term "schema-document" refers to the specific form of XML Representation of schemas that we set out, i.e. that validates per the S4S, etc.  No controversy there I'd think.  So, it seems to me that one could in principle write processors that accept schema information in any or all of these forms:

1) *schema documents* as we define them with the termref in our spec.

2) Other forms of XML that convey the information needed to create or determine components.  The dump format from XSV -r comes to mind as an example.

3) Non-XML forms

The proposed text says that some processors "can read schema documents in the form of XML documents", and I find that confusing for a few reasons.  First of all, it seems to be using the un-hyperlinked phrase "schema documents" for something more general than its hyperlinked equivalent. I found that confusing and I think other readers may too.  Secondly, taken with the rest of the text, it implies that the only XML-form possibility is in fact the (hyperlinked termref) *schema document* that is required by the reference to *schema document aware* processors.

Alternate wording proposal:

> 2. Whether a *minimally conforming* processor is
> additionally able to accept schemas
> represented in the form of XML documents as
> described in Layer 2: Schema Documents, Namespaces
> and Composition (§4.2) is implementation-defined.
> (See Conformance (§2.4), which defines
> "schema-document-aware" processors as
> those that can process schema
> documents in this form.)

Note that the phrasing starting with "schemas represented in the form..." is copied directly from 2.4, and so introduces no new complications or imprecision (I hope).  Also, note that the entire proposal is intentionally modeled on the existing point #3, which says:

> 3. Whether a schema-document aware processor is
> able to retrieve schema documents from the Web is
> implementation-defined. (See Conformance (§2.4),
> which defines "Web-aware" processors as
> schema-document aware processors which can
> retrieve schema documents from the Web.)
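
One way to picture the layering in points #2 and #3 is as a capability declaration that an implementation documents.  The Python sketch below is purely illustrative: the ProcessorProfile class and its field names are hypothetical, and it assumes the reading in the alternate wording above, in which a schema-document aware processor is a minimally conforming processor with an additional capability, and a Web-aware processor is a schema-document aware processor that can also retrieve schema documents from the Web.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessorProfile:
    """Hypothetical record of what a schema processor documents about itself."""

    reads_schema_documents: bool = False  # accepts schemas in the form of XML documents
    retrieves_from_web: bool = False      # can fetch schema documents by URI

    def conformance_claims(self):
        # Assumption: the levels are layered, so each claim builds on the previous one.
        claims = ["minimally conforming"]
        if self.reads_schema_documents:
            claims.append("schema-document aware")
            # Web retrieval only counts for processors that read schema documents.
            if self.retrieves_from_web:
                claims.append("Web-aware")
        return claims


print(ProcessorProfile().conformance_claims())
print(ProcessorProfile(reads_schema_documents=True).conformance_claims())
print(ProcessorProfile(reads_schema_documents=True, retrieves_from_web=True).conformance_claims())
```

Because both properties are implementation-defined, an implementation would be expected to state them in its conformance claim; the sketch only models that statement, not the processing itself.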

Noah

[1] http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.consent.200802.html
Comment 8 C. M. Sperberg-McQueen 2008-02-08 23:27:11 UTC
The WG today accepted the wording proposal mentioned in comment #6,
with the agreement that it represented a partial but not a complete
resolution of this issue.  The editors accepted the advice of the WG
that it would be good to revisit the wording in question if possible.

Accordingly, I'm marking this issue needsDrafting and revising its
expected-effort estimate.
Comment 9 C. M. Sperberg-McQueen 2008-05-27 12:39:57 UTC
Since the issue raised here appears solely to concern wording and tone,
and not the boundaries of conformance, I am marking it editorial.  That
has as a consequence that it may be dealt with after, rather than before,
the next public working draft is published.
Comment 10 C. M. Sperberg-McQueen 2009-10-10 00:31:38 UTC
In August and September 2009 the XML Schema working group performed
triage on the remaining open issues in a WBS poll [1], whose results
are summarized at [2] and accepted formally at [3]. In the course of
that triage we decided to close this issue without further action.
Since this is a WG issue, not an external one, I'm going both to mark
it resolved and to close it.  Since the issue was partially resolved, I'm
marking it FIXED, not WONTFIX, but really it's half and half.

[1] http://www.w3.org/2002/09/wbs/19482/200908CRissues/
[2] http://lists.w3.org/Archives/Member/w3c-xml-schema-wg/2009Sep/0005.html
[3] http://lists.w3.org/Archives/Member/w3c-xml-schema-ig/2009Sep/att-0005/2009-09-11telcon.html#item04
(all links member-only)
Comment 11 Noah Mendelsohn 2009-10-15 19:29:24 UTC
For the record, I accept the working group's decision as documented in Comment #10.  Given the many other important things to be done for XSD 1.1, I think it's an appropriate compromise to close this without further action.  I thank the working group for its careful consideration of my concern.

Noah