24564 – [Imports]: Blocking circular reference in the import tree/list

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 24564 - [Imports]: Blocking circular reference in the import tree/list

Summary: [Imports]: Blocking circular reference in the import tree/list

Status:	RESOLVED FIXED

Alias:	None

Product:	WebAppsWG
Classification:	Unclassified
Component:	HISTORICAL - Component Model (show other bugs)
Version:	unspecified
Hardware:	PC Linux

Importance:	P2 normal
Target Milestone:	---
Assignee:	Dimitri Glazkov
QA Contact:	public-webapps-bugzilla

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	20683 24565
	Show dependency tree / graph

Reported:	2014-02-06 17:13 UTC by Gabor Krizsanits
Modified:	2014-02-13 21:31 UTC (History)
CC List:	7 users (show)

See Also:

Attachments

Description Gabor Krizsanits 2014-02-06 17:13:08 UTC

I'm not sure why exactly were the tree changed to list in the spec., in my mind it's more like a map between uri's and imports, but at very least it should be a set. But one thing this change suggests that we do not necessary disallow circular references among imports. Which I sympathise with, in many cases it's OK to do that. Adding a new edge to the import graph dynamically that results a directed circle after everything was loaded successfully should just work.

However. As the spec. stands, if we have a script blocking circular reference we might end up blocking scripts indefinitely. In that case we think (I talked to Boris and Blake about this) we have to break the circle, and this part should be specced out carefully so we don't end up with different behaviours per browser agents.

Comment 1 Morrita Hajime 2014-02-06 19:08:19 UTC

Thanks for the feedback Gabor!

First, map sounds good to me. Let me think about that.
One reason to use list than tree is that I don't want to have
special-casing the tree root, which isn't an import but the master.

Talking about blocked-by-circular reference, de-dup mechanism
should prohibit it - The cycle should be prevented as a de-dedup,
as "require"-like feaures in many programming languages do.

I agree that the current spec doesn't write about that.

Comment 2 Morrita Hajime 2014-02-07 02:00:12 UTC

(In reply to Morrita Hajime from comment #1)
> Thanks for the feedback Gabor!
> 
> First, map sounds good to me. Let me think about that.
> One reason to use list than tree is that I don't want to have
> special-casing the tree root, which isn't an import but the master.

Uh, it should be list as the order matters. 
It represents a flatten tree with no root.

Comment 3 Gabor Krizsanits 2014-02-07 17:58:23 UTC

I understand that many programming languages disallow cycles, as it simplifies things. But imports are a bit special... 

This list idea is still not clear. First it does not scale compared to a map I'm afraid. I don't understand how will a de-dup mechanism prevent a circle. An edge that causes a circle does not add any new element to the list, the element is already there, it just trying to use it as a second referring link element would. 
I don't understand the ordering either, why is it important? And what happens if I start moving import links around in the DOM (not to mention adopt children among imports)? Or do you want the list to represent a topological order to prevent directed circles? For that I think we would need a list of the link elements not the imports/import loaders.

Sorry, for the many questions but I'm confused, and I think this is a very important problem to address before the release.

Comment 4 Morrita Hajime 2014-02-07 18:50:36 UTC

(In reply to Gabor Krizsanits from comment #3)
> I understand that many programming languages disallow cycles, as it
> simplifies things. But imports are a bit special... 
> 

> This list idea is still not clear. First it does not scale compared to a map
> I'm afraid. I don't understand how will a de-dup mechanism prevent a circle.
> An edge that causes a circle does not add any new element to the list, the
> element is already there, it just trying to use it as a second referring
> link element would. 

Sorry for unclear my explanation.
First let me clarify what cycle prevention that I meant.
I think I used that word in wrong sense.

Think a about this case:

* index.html
  <link id=li rel=import href="a.html">
* a.html
  <link id=la rel=import href="b.html">
* b.html
  <link id=lb rel=import href="a.html">

In this case,

- li triggers loading of a.html. 
- la triggers loading of b.html.
- lb reuses shares(de-dups) a.html that is previously loaded

Now la.import points b.html and lb.import points a.html.
Here is a cycle. But this is fine because a.html document is shared
and there are no infinity loading loop happening.
this endless loading is what I meant cycle.

IMO, circular reference itself is totally fine,
and it seems your point is same (isn't it?)


> I don't understand the ordering either, why is it important? And what
> happens if I start moving import links around in the DOM (not to mention
> adopt children among imports)? Or do you want the list to represent a
> topological order to prevent directed circles? For that I think we would
> need a list of the link elements not the imports/import loaders.

On ordering, current spec doesn't depend on that.
I have a vague plan to use it to resolve some racy cases.

Let me try it in coming changes. If it turns out order doesn't matter.
I'm happy to go back to the map. But I think it matters
because Blink implementation actually relies on it.
Well, I admit that writing spec sometimes leads better way to do it :-)

On relationship between DOM mutation and import list,
my intention to have the "import list" concept explicitly is that
DOM mutation doesn't affect the list order once the import is inserted
into the list by the loading algorithm.

I want it rather static because it changing it dynamically will cause
the nightmare in dependency management perspective. 
Static order can avoid such complexity.

> 
> Sorry, for the many questions but I'm confused, and I think this is a very
> important problem to address before the release.

I really appreciate that you take time to clarify the points and help it
polished.

Comment 5 Boris Zbarsky 2014-02-07 18:53:40 UTC

> But this is fine because a.html document is shared and there are no infinity
> loading loop happening.

Right.  The only question is script execution order.  Normally we guarantee that scripts in an import run before you keep parsing the importing document, but that's not feasible in this cyclic reference case, right?

Comment 6 Morrita Hajime 2014-02-07 19:56:25 UTC

(In reply to Boris Zbarsky from comment #5)
> Right.  The only question is script execution order.  Normally we guarantee
> that scripts in an import run before you keep parsing the importing
> document, but that's not feasible in this cyclic reference case, right?

Right. I believe it isn't problem in practice as long as it is deterministic.

The execution order in general is unclear in the current spec.
Bug 24565 is to address that problem.

Comment 7 Boris Zbarsky 2014-02-07 19:57:45 UTC

> I believe it isn't problem in practice as long as it is deterministic.

Agreed.

Comment 8 Gabor Krizsanits 2014-02-07 21:00:54 UTC

(In reply to Morrita Hajime from comment #4)
> IMO, circular reference itself is totally fine,
> and it seems your point is same (isn't it?)

Yes I think so too.

> Let me try it in coming changes. If it turns out order doesn't matter.
> I'm happy to go back to the map. But I think it matters
> because Blink implementation actually relies on it.
> Well, I admit that writing spec sometimes leads better way to do it :-)

Sounds good, in case it turns out you think you have to rely on it, I would be interested in why exactly. Unless it's something very blink specific. So far map seems to work just fine for me, but maybe I'm just missing some important detail...
Although racy cases, can be tricky, it depends on how do we want to handle them... I think the best would be to get back to this later.

> 
> On relationship between DOM mutation and import list,
> my intention to have the "import list" concept explicitly is that
> DOM mutation doesn't affect the list order once the import is inserted
> into the list by the loading algorithm.
> 
> I want it rather static because it changing it dynamically will cause
> the nightmare in dependency management perspective. 
> Static order can avoid such complexity.

As long as it's deterministic and keeps things simple I'm all for it.

(In reply to Morrita Hajime from comment #6)
> (In reply to Boris Zbarsky from comment #5)
> > Right.  The only question is script execution order.  Normally we guarantee
> > that scripts in an import run before you keep parsing the importing
> > document, but that's not feasible in this cyclic reference case, right?
> 
> Right. I believe it isn't problem in practice as long as it is deterministic.
> 
> The execution order in general is unclear in the current spec.
> Bug 24565 is to address that problem.

Sounds good to me. Thanks for looking into this.

Comment 9 Morrita Hajime 2014-02-10 23:39:55 UTC

I'm trying to fix this in [1] and following changes by:

- Splitting import list into "import-link-list" and "import-map".
  - The list tracks <link>s
  - The map tracks imported documents for de-dup work.

The import-link-list is basically a flattened and short-circuit DOM tree
that only has <link> (modulo mutation).
It is used to control loading/execution order of the documents/scripts.
It throttles loading so that it guarantees loading happening in tree DFS order.

Here is how the DFS ordering works.
Think about following nested imports:

- index.html
  - a.html
    - b.html
  - c.html
  - b.html

In this case, the loading should happen index-a-b-c order so that
c.html can depend on b.html (in a.html).
The order guarantee makes this possible.
Otherwise, c.html can be loaded before b.html, which is bad.

I believe this determinisity works well for cycle prevention as well as
it serializes the loading order.

This doesn't mean we're going to kill all the concurrency
as it prevents any work for c.html before a.html and children are loaded:
It is fine for UAs to prefetch c.html while parsing loading.html

Closing this so far but feel free to reopen or file new bugs.
Your feedback will be appreciated.



[1] https://github.com/w3c/webcomponents/commit/4c3f3600c359ab06fcb3d6a395dcc4b8ea2bb50f

Comment 10 Gabor Krizsanits 2014-02-13 20:42:56 UTC

Hmmm... the latest version seems very promising so far. I have a few concerns regards of performance, but I think we should focus first on the right behaviour and this version is pretty clear in that respect. I guess the only part I'm not sure is, why do we need to block fetching (fetch readiness) instead of just block script execution in the very same way as the algorithm describes (just with 'ready for script execution' flag in the list). Is there a reason why in your example c should not be fetched/parsed with its scripts blocked before b is done? 

The other part is not entirely clear how all these algorithms in the spec work together exactly. More specifically when should one check for fetch readiness? Every time an import finishes we find the next candidate that can be fetched?

One thing for the map. I wonder if it would make sense to make it a <location, loader> map instead of a <location, import document>. But I guess this is more like an implementation detail...

Comment 11 Morrita Hajime 2014-02-13 21:31:40 UTC

(In reply to Gabor Krizsanits from comment #10)
> Hmmm... the latest version seems very promising so far. I have a few
> concerns regards of performance, but I think we should focus first on the
> right behaviour and this version is pretty clear in that respect. I guess
> the only part I'm not sure is, why do we need to block fetching (fetch
> readiness) instead of just block script execution in the very same way as
> the algorithm describes (just with 'ready for script execution' flag in the
> list). Is there a reason why in your example c should not be fetched/parsed
> with its scripts blocked before b is done? 

I spec-ed that the readiness blocks fetching so that
the shape of the import-link-list deterministic.

Think about

* index.html
  <link id=li0 rel=import href="a.html">
  <link id=li1 rel=import href="b.html">
* a.html
  <link id=la rel=import href="b.html">
* b.html
  <link id=lb rel=import href="c.html">
* c.html
  ...

In this case, import-link-list could have either:

- [li0, la, li1, lb]
- [li0, la, lb, li1]

depends on which of li0 or la fetches b.html. I don't want to have this undeterminisity.

However, I now realized that this shouldn't be a problem in practice. I think this is a spec problem that says loading should be fully ordered by stating that there is a import-link-list.

The spec should say, instead, that loading should be partially ordered. Then we no longer need to block fetching but can only block scripting  as you mentioned. I'd be done by employing other data structure than a list. 

So thanks for the clarification! I'll file a bug to address this.

> 
> The other part is not entirely clear how all these algorithms in the spec
> work together exactly. More specifically when should one check for fetch
> readiness? Every time an import finishes we find the next candidate that can
> be fetched?

Yes, this is kinda reactive thing. In Blink, the readiness of each import is 
re-evaluated each time when any one import finished loading.

I hope there are better way to explain this nature but I have no idea so far. Suggestions are welcome!

> 
> One thing for the map. I wonder if it would make sense to make it a
> <location, loader> map instead of a <location, import document>. But I guess
> this is more like an implementation detail...
>

Yeah, the underlying conceptual model here is exactly what you are suggesting.
I use document as a proxy of loader because we have no such abstraction like loader.