4657 – Restrict the use of deref in sml:field/@xpath for SML implementation built on top of relational databases

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WONTFIX

Alias:	None

Product:	SML
Classification:	Unclassified
Component:	Core
Version:	unspecified
Hardware:	PC All

Importance:	P1 normal
Target Milestone:	LC
Assignee:	Kumar Pandit
QA Contact:	SML Working Group discussion list

Duplicates (1):	4827

Reported:	2007-06-17 18:18 UTC by Pratul Dublish
Modified:	2007-11-29 19:27 UTC
CC List:	1 user

Description Pratul Dublish 2007-06-17 18:18:43 UTC

Implementation of SML identity constraints that use the smlfn:deref() function in sml:field/@xpath expressions is challenging for persistent SML stores built on top of relational database systems. The standards body to which SML gets submitted should investigate this and, if needed, explore options to ease the implementation burden for persistent SML stores using relational databases.

Comment 1 Kumar Pandit 2007-08-16 19:36:54 UTC

*** Bug 4827 has been marked as a duplicate of this bug. ***

Comment 2 Valentina Popescu 2007-08-28 13:23:25 UTC

Per the f2f decision, moving this defect to LC.

Comment 3 Kumar Pandit 2007-10-03 06:04:07 UTC

Proposal:
Disallow the use of deref in sml:field/@xpath for SML implementations built on top of relational databases.
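
As an illustration of what enforcing this proposal might look like, here is a minimal sketch of a check that flags sml:field elements whose xpath calls deref(). The namespace URI is a placeholder (an assumption, not taken from this bug), and the function name fields_using_deref and the sample document are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URI; substitute the one your SML schema uses.
SML_NS = "urn:example:sml"

# Matches a call to deref(), with or without a prefix such as smlfn:.
DEREF_CALL = re.compile(r"(?<![\w.-])(?:[\w.-]+:)?deref\s*\(")


def fields_using_deref(schema_xml: str, sml_ns: str = SML_NS) -> list[str]:
    """Return the xpath of every sml:field element whose @xpath calls deref()."""
    root = ET.fromstring(schema_xml)
    offending = []
    for field in root.iter(f"{{{sml_ns}}}field"):
        xpath = field.get("xpath", "")
        if DEREF_CALL.search(xpath):
            offending.append(xpath)
    return offending


# Hypothetical constraint: deref() in the selector is allowed, in a field it is flagged.
SAMPLE = f"""
<constraints xmlns:sml="{SML_NS}">
  <sml:key name="k">
    <sml:selector xpath="smlfn:deref(a)/b"/>
    <sml:field xpath="c"/>
    <sml:field xpath="smlfn:deref(f)/g"/>
  </sml:key>
</constraints>
"""

print(fields_using_deref(SAMPLE))
```

A text match like this is only a rough approximation; a real validator would more likely parse the XPath expression rather than pattern-match it.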

Reasons:
[1]
deref() calls can be nested to any level in the selector or field. Consider the following selector xpath:

deref(deref(deref(a)/b)/c)/d/e

To evaluate this identity constraint, we first need to perform a 3-level recursion using a recursive CTE (common table expression). Assuming a fan-out of 10 references at each level, this gives us 10 x 10 x 10 = 1000 nodes at the leaf level. This can be done in a relatively straightforward way using a single CTE call. For each of the documents in the target document set thus obtained, we then need to apply the xpath /d/e to get the target node set.
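
The recursion described above can be sketched with SQLite's recursive CTE support. This is only an illustration under stated assumptions: the refs(source, target) table, the tree-shaped test data, and the helper names are hypothetical and not part of any SML implementation. It models only the reference-following step, not the /d/e step inside each document.

```python
import sqlite3

FAN_OUT = 10   # references per document, as assumed in the comment above
DEPTH = 3      # one level per nested deref()


def build_store(fan_out: int, depth: int) -> sqlite3.Connection:
    """Create an in-memory refs(source, target) table shaped like a tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE refs (source INTEGER, target INTEGER)")
    next_id, level = 1, [0]          # document 0 is the starting document
    for _level in range(depth):
        children = []
        for source in level:
            for _ in range(fan_out):
                conn.execute("INSERT INTO refs VALUES (?, ?)", (source, next_id))
                children.append(next_id)
                next_id += 1
        level = children
    return conn


def deref_targets(conn: sqlite3.Connection, start: int, depth: int) -> list[int]:
    """Follow references `depth` levels from `start` with one recursive CTE."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(doc, lvl) AS (
            SELECT ?, 0
            UNION ALL
            SELECT r.target, reach.lvl + 1
            FROM reach JOIN refs AS r ON r.source = reach.doc
            WHERE reach.lvl < ?
        )
        SELECT doc FROM reach WHERE lvl = ? ORDER BY doc
        """,
        (start, depth, depth),
    ).fetchall()
    return [doc for (doc,) in rows]


conn = build_store(FAN_OUT, DEPTH)
leaves = deref_targets(conn, start=0, depth=DEPTH)
print(len(leaves))  # FAN_OUT ** DEPTH documents at the leaf level
```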

Suppose we have to further evaluate a field xpath such as the one shown below.

deref(f)/g

Here, we need to make a CTE call for each node in the target node set. Remember that deref() can be nested to any level. To make matters worse, there may be two or more field xpaths: one without deref(), one with a single deref(), one with a 2-level deref(), and so on. This requires us to combine the result set of the first CTE with that of each of the further CTEs, which is an extremely inefficient operation to perform in a large store.

If we can simply shift the deref(f) into the selector xpath (for example, deref(deref(deref(deref(a)/b)/c)/d/e/f)/g), then we can compute the same result set in just one CTE.

There is a huge difference in performance between the two cases: one CTE call per node in the target node set versus a single CTE call.
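
The two cases can be compared with a rough sketch that counts the queries each strategy issues. It assumes a hypothetical refs(source, target) table in SQLite and treats every xpath step as a plain reference hop, so it illustrates only the shape of the cost, not real SML evaluation. All helper names are made up for this example.

```python
import sqlite3

HOPS = """
WITH RECURSIVE reach(doc, lvl) AS (
    SELECT ?, 0
    UNION ALL
    SELECT r.target, reach.lvl + 1
    FROM reach JOIN refs AS r ON r.source = reach.doc
    WHERE reach.lvl < ?
)
SELECT doc FROM reach WHERE lvl = ?
"""


def build_store(fan_out: int, depth: int) -> sqlite3.Connection:
    """Create an in-memory refs(source, target) table shaped like a tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE refs (source INTEGER, target INTEGER)")
    next_id, level = 1, [0]
    for _level in range(depth):
        children = []
        for source in level:
            for _ in range(fan_out):
                conn.execute("INSERT INTO refs VALUES (?, ?)", (source, next_id))
                children.append(next_id)
                next_id += 1
        level = children
    return conn


def follow(conn, start, hops):
    """Run one recursive CTE that follows `hops` reference levels."""
    return [doc for (doc,) in conn.execute(HOPS, (start, hops, hops))]


def field_deref_per_node(conn, start, selector_hops):
    """deref() in the field: one extra CTE per node the selector returns."""
    queries = 1
    selected = follow(conn, start, selector_hops)
    result = []
    for node in selected:
        result.extend(follow(conn, node, 1))
        queries += 1
    return sorted(result), queries


def deref_folded_into_selector(conn, start, selector_hops):
    """deref() moved into the selector: a single CTE, one level deeper."""
    return sorted(follow(conn, start, selector_hops + 1)), 1


conn = build_store(fan_out=3, depth=4)
slow, slow_queries = field_deref_per_node(conn, 0, 3)
fast, fast_queries = deref_folded_into_selector(conn, 0, 3)
print(slow == fast, slow_queries, fast_queries)
```

Both strategies return the same documents, but the per-node variant issues one query per selected node plus the selector query, so its query count grows with the size of the target node set.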

[2]
Allowing deref() only in the selector supports almost all practical cases.

[3]
If we decide to add this support later, it will be a non-breaking change. On the other hand, if we allow deref() in field now and decide to remove that support in a later version of SML, it will be a breaking change. Breaking changes are nearly impossible to make once a standard is adopted.

Comment 4 Valentina Popescu 2007-10-12 21:31:06 UTC

Attached is the result of investigating this issue as required by action http://www.w3.org/2005/06/tracker/sml/actions/118

http://lists.w3.org/Archives/Public/public-sml/2007Oct/0048.html