This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 4639 - Allow cycle checking on element graphs as well as document graphs
Summary: Allow cycle checking on element graphs as well as document graphs
Status: RESOLVED FIXED
Alias: None
Product: SML
Classification: Unclassified
Component: Core (show other bugs)
Version: unspecified
Hardware: Macintosh All
: P2 normal
Target Milestone: ---
Assignee: Virginia Smith
QA Contact: SML Working Group discussion list
URL:
Whiteboard:
Keywords: resolved
Depends on:
Blocks: 4743
  Show dependency treegraph
 
Reported: 2007-06-13 00:46 UTC by C. M. Sperberg-McQueen
Modified: 2007-11-29 23:45 UTC (History)
0 users

See Also:


Attachments

Description C. M. Sperberg-McQueen 2007-06-13 00:46:01 UTC
In the current version of SML, sml:acyclic is defined with reference to 
(directed) cycles in directed graphs whose arcs are references of the
specified type and whose nodes are documents containing elements which
serve either as the source or the target of the references.  

Given that in many cases one could choose to organize a body of data
either in many small documents (say, 20,000 'student' documents,
each containing one 'student' element) or one large document (with a 
container and 20,000 'student' elements), it would clearly be useful for
schema authors and system designers to be able to prohibit cycles
either on document graphs (as currently supported) OR on graphs whose
nodes are elements.  Any use case for cycle checking on document-graphs
is also a potential use case for cycle checking on graphs of elements.

There may be implementation-related reasons to support document
graphs and not element graphs.  Some WG members suggested, for example,
that it's easier to point at documents than at elements in a stable way, 
so that it's easier to maintain an index for checking checking cycles 
in document graphs than in element graphs.  The WG will need to weigh
design issues (and the misfit between references pointing at elements
not documents, vs. cycles being checked only on graphs of documents)
and implementation issues.
Comment 1 Kumar Pandit 2007-10-03 04:54:16 UTC
I agree with the proposal from Virginia (CyclesInSMLModels.doc) that SML should define cycles as element based. The proposal leaves open the issue whether we should also have document based cycles. I propose that we should have only one type of cycles: element-based. That is, we should not have document based cycles.
Comment 2 James Lynn 2007-10-03 21:26:24 UTC
The proposed definition is quite brief and is marked as Normative below. I felt this also warranted some explanation of my thinking and is marked as Non-normative. I believe that some of the non-normative text, or something like it, should also be included in the spec, just enough to make the intent clear.

 

Normative: 

A path is defined as a set of elements connected by SML References of the same type. 

An element which has a path back to itself is a cycle.

 

 

Non-normative: Since the sole purpose of SML is to represent a model, I think it helps to be clear on what scenario we are trying to describe in the model. In simple graph terminology, a path is a set of nodes connected by an edge, or arc. A cycle can be defined as a path from a node back to itself. For the purposes of this proposal, the distinction between undirected graphs and directed graphs is ignored.

 

Start with the simplest and easy to understand Use Case. This makes counterexamples, etc., easier:

Model Domain: Genealogy

Our nodes are people, our edges or arcs are the relationship isParentOf. We want to ensure that our model does not allow a scenario where John isParentof Sue, Sue isParentof Sam, and Sam isParentof John.  

            Note that this does not preclude the same scenario but where in the last clause we have Sam isFriendof John. This is because the relationship(i.e., the edge or arc) is different. So the key in detecting cycles is that the path must be made up of edges of the same type to be a cycle.

            It is sometimes desirable to have type constraints on the nodes themselves, for example in the relationship isMotherof, the first node must be of type female but the second node may be either male or female. This can be accomplished in general by SML type constraints, including SML reference targetType, but is not considered as a criterion for cycle checking.
Comment 3 Kumar Pandit 2007-10-03 21:54:27 UTC
This is a good definition however it also needs to cover the following:
1. we need to define what 'elements connected by SML references' means.
2. Clarify if acyclic behavior is restricted to 'same type' or 'a type or any type derived from it'.
Comment 4 Sandy Gao 2007-10-04 17:32:47 UTC
The following email can be used to complete the proposal in comment #2 and to answer questions raised in comment #3.

http://lists.w3.org/Archives/Public/public-sml/2007Sep/0032.html
Comment 5 Virginia Smith 2007-10-11 13:37:59 UTC
Updated proposal document with Sandy's suggestion. Still need to clarify a few points with Sandy.
http://lists.w3.org/Archives/Public/public-sml/2007Oct/0066.html
Comment 6 Virginia Smith 2007-10-25 22:19:57 UTC
Updated proposal with latest from f2f.
http://lists.w3.org/Archives/Public/public-sml/2007Oct/0210.html
Comment 7 Kirk Wilson 2007-10-26 14:38:18 UTC
I would like to suggest a small editorial change to the definition of cycle.  (If you omit the phrase in commas in the current definition, the definition no longer makes sense, and the temporal "when" offends my sense that this is structural definition).  Suggested revision is quite simple:

A cycle is formed for an element E if the path that is formed by recursively following an SML reference that is a descendant of E to its target leads back to E.

Also, I don't recall that part of the discussion at the F2F reintroduced the sml:ref="true" attribute at the schema level (as represented in Proposals 2 & 3).  Was this reintroduced during the subsequent conference call?
Comment 8 Kumar Pandit 2007-10-30 18:37:07 UTC
During the Redmond f2f meeting, we had discussed a proposal where
-- we retain sml:acyclic on SML ref
-- we consider hierarchy of a ref when detecting cycles

For example, consider docs d1 & d2. d1 has element e1 and d2 has element
e2. Suppose sml:acyclic=true is defined on complexType A. In this case,
a cycle is formed between e1 & e2 if:
-- e1 is or contains SML ref of type A (or a type derived from it) to e2
AND
-- e2 is or contains SML ref of type A (or a type derived from it) to e1

Is this proposal covered in the latest doc? If not, was this
subsequently discussed in some email that I missed?

Comment 9 Virginia Smith 2007-10-31 23:11:04 UTC
Proposal 2 covers the first item (keeping the sml:acyclic on the ref). I think that proposal 2 in the doc degenerates into your example (with no xpath expression) when the xpath expression refers to "itself", meaning that <e sml:acyclic="./e"/>. (I think this will be the most common use case.)

Also, with regard to a ref hierarchy, I don't remember that discussion but consider the following:
If a Person has a ref to both Child and Friend and these 2 refs are both derived from PersonRef, that would be a problem when detecting Child or Friend cycles since they are distinct cycles but have the refs have same type. If I understand the 2 point (considering ref hierarchy), then the cycle test says "a ref of type A or a type derived from it". But if you put the acyclic on child, then if A is the type "personRef", the cycle would look at both friend and child and consider them both part of the same cycle which is not the intent. 
For example, consider that e1 in doc1 has a friend e2 in doc2 and e2  has a child e3 in doc3 and e3 has a friend e1 in doc 1. This is not a cycle but would look like one if you consider all refs of the same type to define a cycle.

Comment 10 Kumar Pandit 2007-11-01 08:47:27 UTC
If the schema author put acyclic=true on personref then it is his/her intent to detect cycles in personrefType and any type derived from personrefType. Thus it is by definition that friend cycle and child are not distinct. 

On the other hand, if the child and person cycles are meant to be distinct the one would not put acyclic on personref. In that case, one would put acyclic=true on child and friend separately.

The second example depends on where acyclic=true is defined. 
Comment 11 Virginia Smith 2007-11-04 20:13:09 UTC
The following are my recommendations for changing the specification:

1. Replace section 4.3.1 with the following text:
------
sml:acyclic is used to specify that a cycle is not allowed for an SML reference type.

Consider a directed graph whose arcs are SML references of a given complexType (and any derived types) and whose nodes are the set of elements pointed to by any of these SML references. A cycle is formed for an element E if the path that is formed by recursively following an SML reference (of a given complexType and any derived types) from E to its target leads back to E. 
Model validators that conform to this specification MUST support the sml:acyclic attribute on any <xs:complexType> element in a schema document. This is a boolean attribute and its value can be either true or false.
--------------

2. Replace section 4.3.1.3 with the following text:
----------------
If T is a complex type definition with {acyclic} true, then instances of T MUST NOT create cycles in any model. More precisely, the directed graph whose arcs are SML references of a given complexType (and any derived types) and whose nodes are all the elements pointed to by this set of SML references, must be acyclic.
----------------

3. Replace section 8.1.1 with the following text:
----------------
8.1.1 sml:acyclic 
Used to specify that an SML reference does not create any cycles in a model.

If this attribute is set to true for a complex type D, then instances of D (including any derived types of D) that are SML references cannot create any cycles in a model. In the following example, HostedOnRefType is a complex type declaration whose instances can not create a cycle:

<xs:complexType name="HostedOnRefType" sml:acyclic="true">
...
</xs:complexType>

If the sml:acyclic attribute is not specified or set to false for a complex type declaration ration, then instances of this type that are SML references may create cycles in a model.
------------------------
Comment 12 Virginia Smith 2007-11-04 20:15:57 UTC
I meant to add the following to the previous comment:

See also the working document at: http://lists.w3.org/Archives/Public/public-sml/2007Nov/0045.html
Comment 13 Virginia Smith 2007-11-14 03:24:50 UTC
It is easy to tell what node an SML reference points to (the reference target). But, after discussions with Sandy, I don't think that this proposal is adequate in identifying with certainty what node the reference comes from (the reference source). It has to be an ancestor of the reference element but which one? I still think that proposals 2 and 3 in the working document are too complex (and unnecessarily so) and I still like proposal 4 as implemented in comment #11 but I propose to amend it in one of the following 2 ways. (I'm still deciding which one I prefer and I have not updated the working document yet.)

1 - add an attribute to the sml reference that locates it's source element. 
<Prerequisite sml:ref="true" sml:acyclic="true" sml:refSource="ancestor::Course">
  <sml:uri>URI</sml:uri>
</Prerequisite>

positives: 
- This keeps all information required to specify the sml reference on the reference itself.

negatives: 
- In the case where a global reference type is used to define a relationship among different element types, the refSource attribute content cannot be specified in the schema but must be added (repeatedly) by the instance document author *each time* the reference is used in the xml since the name of the element representing the reference source may be different. 
- This model requires "looking up" the hierarchy of the document which is not normally the case. 


2 - Use the annotation as specified in proposal 1. The annotation specified the element as the reference source. 
<xs:complexType name="CourseType">
  <xs:annotation>
    <xs:appinfo>
      <sml:acyclic ref=".//Prerequisite"/>
    </xs:appinfo>
  </xs:annotation>
...

positives: 
- This model requires "looking down" the hierarchy which may feel more natural. 
- The specification of the reference source can be made once for each reference source type, that is, in the complexType definition of the source element type. This works even in the case where there are different element types participating as nodes in the cycle.
- Keeps the specification of the reference source in the schema which is also where the acyclic specification will likely be. (although on a different element).

negatives:
- The reference information (acyclic and refSource) is split into 2 locations. 
- Requires annotation inheritance to be supported by the SML validator.
Comment 14 Kirk Wilson 2007-11-27 00:08:20 UTC
I would "vote" for option 2 in Comment #13, but I don't have any other reason that that it "feels" more "right" than option 1.  My only "concern"--this is NOT an objection--with Proposal 4 is that its inherent recursiveness could be memory intensive. 
Comment 15 Virginia Smith 2007-11-29 00:06:56 UTC
Based on further discussions with Sandy, Kumar, and myself, the following proposal revision removes the necessity for one of the 2 options in comment #13. Essentially, this means that SML references are the arcs in the directed graph, targets of the references are the arc target nodes in the graph, and all ancestors of a reference are the source nodes of the arcs. 

The following is a further refinement of the proposal (replaces comment #11). Note that the section numbers are updated to reflect the current spec (editors' copy):
=================

1. Replace section 4.4.1 with the following text.
------------------------------
sml:acyclic is used to specify that a cycle is not allowed for an SML reference type. Model validators that conform to this specification MUST support the sml:acyclic attribute on any <xs:complexType> element in a schema document. This is a boolean attribute and its value can be either true or false.
------------------------------

2. Replace section 4.4.1.3 with the following text.
------------------------------
If CT is a complex type definition with {acyclic} true, then instances of CT MUST NOT create cycles in the model. More precisely, the directed graph constructed in the following way MUST be acyclic:

1. The nodes in the graph are all the elements resolved to by SML references of type CT or types derived from CT.

2. If a node N is or contains an SML reference R of type CT or a type derived from CT, and R resolves to T (which must also be a node in the graph), then an arc is drawn from N to T.
-------------------------------

3. Replace section 8.1.1 with the following text.
-------------------------------
8.1.1 sml:acyclic 

Used to specify that instances of an SML reference of a given type and its derived types does not create any cycles in a model.

If this attribute is set to true for a complex type D, then instances of D (including any derived types of D) that are SML references cannot create any cycles in a model. In the following example, HostedOnRefType is a complex type declaration whose instances cannot create a cycle:

<xs:complexType name="HostedOnRefType" sml:acyclic="true">

...

</xs:complexType>

If the sml:acyclic attribute is not specified or set to false for a complex type declaration ration, then instances of this type that are SML references may create cycles in a model.
-------------------------------

Comment 16 Kumar Pandit 2007-11-29 01:46:02 UTC
I agree with the proposal in comment# 15.
Comment 17 Valentina Popescu 2007-11-29 14:38:59 UTC
+1 for proposal described in comment #15
Comment 18 Pratul Dublish 2007-11-29 20:44:30 UTC
fix as per #15
Comment 19 Virginia Smith 2007-11-29 23:45:36 UTC
Made changes described in comment #15.