This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Bug 22699 - Expose the origins of ancestor frames
Summary: Expose the origins of ancestor frames
Status: RESOLVED MOVED
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: Other All
Importance: P3 enhancement
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
URL:
Whiteboard:
Keywords:
Depends on: 23682
Blocks:
Reported: 2013-07-16 18:01 UTC by Ian 'Hixie' Hickson
Modified: 2016-12-07 18:59 UTC

Description Ian 'Hixie' Hickson 2013-07-16 18:01:52 UTC
On Mon, 26 Mar 2012, Adam Barth wrote:
> 
> For nested browsing contexts, expose the origin of the parent browsing 
> context via location.parentOrigin.  (For non-nested browsing context, 
> the property would be null.)

This ended up implemented in WebKit as Location.ancestorOrigins(), a 
method that returns a static DOMStringList with the origins of the 
ancestor browsing contexts in reverse order (top-level browsing context 
last, parent browsing context first). It doesn't respect sandboxing.
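
The ordering described above (parent first, top-level last) can be sketched as follows. This is a minimal illustration rather than code from any implementation: the helper names `ancestorOriginsOf`, `parentOrigin` and `topOrigin` are hypothetical, and the `loc` argument is assumed to be a Location-like object such as `window.location`.

```javascript
// Hypothetical helper: copy the ancestor origins of a Location-like object
// into a plain array. Returns [] when the property is not implemented.
function ancestorOriginsOf(loc) {
  const list = loc && loc.ancestorOrigins;
  if (!list) return [];
  // Array.from accepts any array-like (length plus indexed entries),
  // so it works for a DOMStringList as well as for a real array.
  return Array.from(list);
}

// The first entry is the parent browsing context's origin.
function parentOrigin(loc) {
  const origins = ancestorOriginsOf(loc);
  return origins.length > 0 ? origins[0] : null;
}

// The last entry is the top-level browsing context's origin.
function topOrigin(loc) {
  const origins = ancestorOriginsOf(loc);
  return origins.length > 0 ? origins[origins.length - 1] : null;
}
```

In a browser that ships the feature, `parentOrigin(window.location)` would be expected to return null in a top-level document, since the list is empty there.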

See also further discussion:
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2012Nov/0259.html
   http://lists.w3.org/Archives/Public/public-whatwg-archive/2013Jul/0188.html

<jgraham> I seem to recall that people don't like accessing window.location.href in possibly non-same-origin cases because the exception is always reported in the js console in weblinkit browsers, even if it is caught
Comment 1 Ian 'Hixie' Hickson 2013-11-13 20:23:54 UTC
ancestorOrigins doesn't expose the origin of window.opener, which may or may not be an issue; it's not clear to me.

Do we have implementor interest on any approach here from other vendors? Do both Safari and Chrome do this now?
Comment 2 Anne 2013-12-13 12:46:39 UTC
What's the use case? Stack inspection is generally a bad idea.
Comment 3 Ian 'Hixie' Hickson 2013-12-13 17:19:01 UTC
Use case of what? checking the origin of your ancestors?

If they're all same-origin, you can relax click-jacking protections, etc.
Comment 4 Anne 2014-01-13 15:14:31 UTC
Isn't the only way to prevent those attacks to not allow yourself to be embedded?

By the way, DOMStringList is dead. This should just return an array if we do this at all. From what I remember with CORS and redirects however we concluded that if you crossed the origin boundary twice we'd just return null and not a list of origins, because the latter has all kinds of issues.
Comment 5 Ian 'Hixie' Hickson 2014-01-13 17:14:40 UTC
Not allow yourself to be embedded by someone else, right. The point is this lets you check if anyone else is embedding you.
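
As a rough sketch of the check discussed in comments 3 and 5, a framed document could compare every ancestor origin with its own. The function name `allAncestorsSameOrigin` is hypothetical, and the sketch assumes a Location-like object exposing `origin` and `ancestorOrigins`; it treats a missing property as "unknown" and answers conservatively.

```javascript
// Hypothetical check: true only if every ancestor browsing context has the
// same origin as the document itself.
function allAncestorsSameOrigin(loc) {
  const list = loc.ancestorOrigins;
  // Property not implemented: we cannot tell, so do not relax anything.
  if (!list) return false;
  // An empty list (top-level document) trivially passes.
  return Array.from(list).every((origin) => origin === loc.origin);
}
```

A page might call `allAncestorsSameOrigin(window.location)` before relaxing a click-jacking defence. As noted in the description, the WebKit implementation does not respect sandboxing, so this should not be read as a complete protection.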
Comment 6 Ian 'Hixie' Hickson 2014-10-10 18:23:04 UTC
The WebKit/Blink fork means there are now two shipping rendering engines with this. I guess I should spec it? But they use DOMStringList so it wouldn't exactly match the only implementations...
Comment 7 Niek 2014-10-21 06:01:09 UTC
Mozilla is awaiting the spec in order to implement this: https://bugzilla.mozilla.org/show_bug.cgi?id=1085214
Comment 8 Anne 2014-10-21 07:03:56 UTC
If we do this I would prefer it if we make it asynchronous. Sure there's already a ton of cross-origin/frame APIs that are synchronous, but no reason to continue that approach I think.

  Promise<sequence<USVString>> getAncestorOrigins()
Comment 9 Ian 'Hixie' Hickson 2014-10-21 22:50:49 UTC
Well, what's the advantage of making it async if there's already sync ones?

I'm reluctant to spec something that doesn't match existing implementations. If you think that's a sufficiently better API, I would recommend having Firefox implement it, since then there'd be two APIs for me to spec.

Adam, do you have an opinion on this? (I believe it was you who implemented this originally in WebKit?)
Comment 10 Adam Barth 2014-10-30 20:09:48 UTC
(In reply to Ian 'Hixie' Hickson from comment #9)
> Adam, do you have an opinion on this? (I believe it was you who implemented
> this originally in WebKit?)

I don't see a benefit to making it async.  Even for out-of-process iframes, you need to have an in-process representation of your frame tree, which is easy to decorate with origins.
Comment 11 contributor 2015-01-15 22:15:09 UTC
Checked in as WHATWG revision r8881.
Check-in comment: Spec location.ancestorOrigins
https://html5.org/tools/web-apps-tracker?from=8880&to=8881
Comment 12 Anne 2015-01-16 09:10:38 UTC
Two things:

1) Niek overstated Mozilla's case in comment 7. E.g. bz is not really convinced we should expose more information.

2) Mozilla will not implement this with the DOMString[] syntax. Those are legacy IDL-array objects and we do not want any more of them. I added a dependency bug where this situation will hopefully be resolved so we can return an actual array here.
Comment 13 Erik Dubbelboer 2015-11-02 14:00:19 UTC
Any update on this?
Comment 14 Philip Jägenstedt 2015-11-02 14:16:45 UTC
Why is this issue still open? Can you (Anne?) file new issues for the specific problems?

FWIW, I'm not overly optimistic about being able to change this away from using DOMStringList in Blink.
Comment 15 Boris Zbarsky 2015-11-02 14:35:36 UTC
> Why is this issue still open? 

Because it's not clear whether this should be a .ancestorOrigins or a getAncestorOrigins(), for one thing.

> I'm not overly optimistic about being able to change this away from using
> DOMStringList

May I ask why not?  Especially if it becomes getAncestorOrigins() at the same time?
Comment 16 Anne 2015-11-02 14:42:16 UTC
Note also that DOMStringList != DOMString[] so either way the current specification is wrong.
Comment 17 Philip Jägenstedt 2015-11-02 15:14:31 UTC
(In reply to Boris Zbarsky from comment #15)
> > Why is this issue still open? 
> 
> Because it's not clear whether this should be a .ancestorOrigins or a
> getAncestorOrigins(), for one thing.
> 
> > I'm not overly optimistic about being able to change this away from using
> > DOMStringList
> 
> May I ask why not?  Especially if it becomes getAncestorOrigins() at the
> same time?

DOMStringList has item() and contains() methods that would go away if FrozenArray<USVString> or any other Array-creating syntax is used. These are the relevant use counters:

item() ~0.23%:
https://www.chromestatus.com/metrics/feature/timeline/popularity/847

contains() ~0.0005%:
https://www.chromestatus.com/metrics/feature/timeline/popularity/849

And one for the ancestorOrigins attribute itself, ~5%:
https://www.chromestatus.com/metrics/feature/timeline/popularity/823

So it's the item() method that ruins it. Joshua Bell (CC'd) has tried to figure out how to deal with this situation, but we're out of ideas for the usual ways. Depending on how terrible DOMStringList seems as a final outcome, I suppose we might try something new, like making the change to find out what breaks. WDYT, Joshua?

As for getAncestorOrigins(), 5% is a lot, so getting the attribute in order seems less disruptive.
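
To illustrate the item()/contains() concern in this comment: content that calls `list.item(0)` would throw if the list became a plain array, because arrays have no `item` method. A page that wants to tolerate either shape could use something like the sketch below. The helper names `originAt` and `hasOrigin` are hypothetical.

```javascript
// Hypothetical helper: read one entry from either a DOMStringList-like
// object or a plain array. DOMStringList.item() returns null when the index
// is out of range, so the array branch mirrors that.
function originAt(list, index) {
  if (typeof list.item === "function") return list.item(index);
  return index >= 0 && index < list.length ? list[index] : null;
}

// Hypothetical helper: membership test for either shape.
function hasOrigin(list, origin) {
  if (typeof list.contains === "function") return list.contains(origin);
  return Array.prototype.includes.call(list, origin);
}
```

Indexed access (`list[0]`) and `length` work on both shapes, which is why the minified code quoted later in this thread would be unaffected by such a change.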
Comment 18 Boris Zbarsky 2015-11-02 15:48:54 UTC
> These are the relevant use counters:

Are those use counters for the DOMStringList returned from ancestorOrigins, or for DOMStringList in general?  Because we know that DOMStringList.item() in general is used, but we know that from other places in the platform that use it and are much more interoperably implemented.

> And one for the ancestorOrigins attribute itself, ~5%:

Sure, but since neither Gecko nor IE (tested Edge and 11) implement it, it's somewhat unlikely that those uses are actually critical to those pages.  I'm rather curious as to how it's actually getting used.  GitHub search at https://github.com/search?utf8=%E2%9C%93&q=ancestorOrigins shows exactly one hit on ancestorOrigins, and even that one claims the repo is empty if you follow the link.
Comment 19 Philip Jägenstedt 2015-11-02 16:11:40 UTC
(In reply to Boris Zbarsky from comment #18)
> > These are the relevant use counters:
> 
> Are those use counters for the DOMStringList returned from ancestorOrigins,
> or for DOMStringList in general?  Because we know that DOMStringList.item()
> in general is used, but we know that from other places in the platform that
> use it and are much more interoperably implemented.

The ones I listed are specifically tracking DOMStringList instances created for ancestorOrigins. The other usage in Blink is IndexedDB, and there the situation is the reverse, interestingly enough:
https://www.chromestatus.com/metrics/feature/timeline/popularity/846
https://www.chromestatus.com/metrics/feature/timeline/popularity/848

> > And one for the ancestorOrigins attribute itself, ~5%:
> 
> Sure, but since neither Gecko nor IE (tested Edge and 11) implement it, it's
> somewhat unlikely that those uses are actually critical to those pages.  I'm
> rather curious as to how it's actually getting used.  Github search at
> https://github.com/search?utf8=%E2%9C%93&q=ancestorOrigins shows exactly one
> hit on ancestorOrigins, and even that one claims the repo is empty if you
> follow the link.

Even if it isn't critical and even if the attribute's absence is always handled, it still presumably serves some purpose. Is there a strong reason to use a method instead, wouldn't FrozenArray<USVString> work?
Comment 20 Boris Zbarsky 2015-11-02 16:36:50 UTC
> The ones I listed are specifically tracking DOMStringList instances created for
> ancestorOrigins.

Ah, I see.  That's unfortunate; it does suggest that if we keep it as .ancestorOrigins we also have to keep it as DOMStringList; .23% is a bit steep to break with what would be a JS exception in practice.

> it still presumably serves some purpose. 

I used to think web sites were written like that, before I started working on web browsers....

Seriously, though, if we have good data on this I'd love to see it.

> wouldn't FrozenArray<USVString> work?

Sounds like not, given the item() situation.

Note that I'm still not convinced about exposing this information at all, for what it's worth.  Which is one reason I'm curious to see how this is getting used in practice.
Comment 21 Anne 2015-11-02 17:09:48 UTC
Note that service workers also want this exposed, so you can distinguish UI-wise between being embedded on a third-party site and just being embedded on your own site. Though arguably that would not need to expose the entire chain.
Comment 22 Boris Zbarsky 2015-11-02 17:16:07 UTC
Right, I would be much happier exposing a single boolean for that bit of state.  That has very compelling use cases, and you can already get that info hackily in a Window by simply walking the parent chain and seeing if you can touch .document all up it.
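
The parent-chain walk mentioned in this comment could look roughly like the sketch below. The function name `isEmbeddedCrossOrigin` is hypothetical, and the sketch assumes that reading `.document` on a cross-origin Window throws, which is the behaviour the comment relies on.

```javascript
// Hypothetical helper: walk up the parent chain and report whether any
// ancestor window is inaccessible, i.e. cross-origin to the caller.
function isEmbeddedCrossOrigin(win) {
  let current = win;
  // For a top-level window, window.parent refers to the window itself.
  while (current.parent && current.parent !== current) {
    current = current.parent;
    try {
      // Touching .document on a cross-origin window is expected to throw.
      void current.document;
    } catch (e) {
      return true;
    }
  }
  return false;
}
```

In a page this would be called as `isEmbeddedCrossOrigin(window)`. It yields only the single boolean discussed here, not the origins themselves.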
Comment 23 Joshua Bell 2015-11-02 17:42:55 UTC
(In reply to Philip Jägenstedt from comment #19)
> https://www.chromestatus.com/metrics/feature/timeline/popularity/846
> https://www.chromestatus.com/metrics/feature/timeline/popularity/848

Aside: these URLs are unfortunately not useful over time, as they link to the N'th most popular feature which is, of course, dynamic. :(

We should fix that. But just leaving breadcrumbs for future readers.

DOMStringList.contains() w/ IndexedDB as source: 0.45%
DOMStringList.contains() w/ Location.ancestorOrigins as source: 0.00075%
DOMStringList.item() w/ IndexedDB as source: 0%
DOMStringList.item() w/ Location.ancestorOrigins as source: 0.225%

All of these appear to be fairly stable numbers, perhaps trending slightly up.
Comment 24 Philip Jägenstedt 2015-11-02 19:09:51 UTC
(In reply to Joshua Bell from comment #23)
> (In reply to Philip Jägenstedt from comment #19)
> > https://www.chromestatus.com/metrics/feature/timeline/popularity/846
> > https://www.chromestatus.com/metrics/feature/timeline/popularity/848
> 
> Aside: these URLs are unfortunately not useful over time, as they link to
> the N'th most popular feature which is, of course, dynamic. :(

No, actually these numbers are the enum values from UseCounter.h, and they are never changed and never reused, except if someone makes a terrible mistake :)
Comment 25 Philip Jägenstedt 2015-11-02 19:37:40 UTC
(In reply to Boris Zbarsky from comment #20)
> > The ones I listed are specifically tracking DOMStringList instances created for
> > ancestorOrigins.
> 
> Ah, I see.  That's unfortunate; it does suggest that if we keep it as
> .ancestorOrigins we also have to keep it as DOMStringList; .23% is a bit
> steep to break with what would be a JS exception in practice.

Yeah, going with DOMStringList would be safe, but a bit sad. There's a blink-dev thread on this that I will poke:
https://groups.google.com/a/chromium.org/d/msg/blink-dev/j23bosJMX-8/ftX4jptNqGkJ

> > it still presumably serves some purpose. 
> 
> I used to think web sites were written like that, before I started working
> on web browsers....
> 
> Seriously, though, if we have good data on this I'd love to see it.

FWIW, loading YouTube and searching the source finds a few instances of ancestorOrigins, so I wouldn't be surprised if randomly checking that, other Google properties and other things that are made for embedding, like other video and maybe Facebook/Twitter widgets would find some representative cases. Minified and unreadable, of course.

> > wouldn't FrozenArray<USVString> work?
> 
> Sounds like not, given the item() situation.

Here I was assuming that the reason for suggesting a method is that it's OK for that to return a new array every time. But yes, FrozenArray<USVString> is feasible only assuming that we could get that item() usage flushed out somehow.

> Note that I'm still not convinced about exposing this information at all,
> for what it's worth.  Which is one reason I'm curious to see how this is
> getting used in practice.

I assume that Adam Barth might know some specific ways in which it is being used.
Comment 26 Boris Zbarsky 2015-11-02 19:48:13 UTC
> loading YouTube and searching the source finds a few instances of
> ancestorOrigins

Just on <https://www.youtube.com/>?  In which source files?  I'm not obviously seeing it, but maybe devtools are lying to me.
Comment 27 Philip Jägenstedt 2015-11-02 21:39:00 UTC
(In reply to Boris Zbarsky from comment #26)
> > loading YouTube and searching the source finds a few instances of
> > ancestorOrigins
> 
> Just on <https://www.youtube.com/>?  In which source files?  I'm not
> obviously seeing it, but maybe devtools are lying to me.

Yeah, at least if you use a Chromium-based browser, after loading <https://www.youtube.com/> you can find matches in three files by searching in devtools:
https://s0.2mdn.net/879366/html_expanding_rendering_lib_200_106.js
https://s0.2mdn.net/879366/iframe_buster_200_106.js
https://s.ytimg.com/yts/jsbin/html5player-new-en_US-vflsLAYSi/html5player-new.js

Because it's minified, you'd have to stare at these for a while to guess what they're for, but they certainly look deliberate.
Comment 28 Niek 2015-11-02 23:27:10 UTC
(In reply to Philip Jägenstedt from comment #25)
> > Note that I'm still not convinced about exposing this information at all,
> > for what it's worth.  Which is one reason I'm curious to see how this is
> > getting used in practice.
> 
> I assume that Adam Barth might know some specific ways in which it is being
> used.

FWIW, ancestorOrigins() is used a lot in ad tech. Systems like DoubleClick, AppNexus and fraud-detection tools like ForensIQ rely on it to detect the top domain where ads are loaded. It's the only way to detect which domain is embedding an iframe. If required I can get these companies to step in this thread and explain how this is critical to their business.
Comment 29 Boris Zbarsky 2015-11-03 02:02:58 UTC
> https://s0.2mdn.net/879366/html_expanding_rendering_lib_200_106.js

Thanks.  So this is in a function that looks like this:

    var Ei = function() {
        try {
            if ("" != m.document.referrer) return m.document.referrer;
            if (m.location.ancestorOrigins && m.location.ancestorOrigins[0]) return m.location.ancestorOrigins[0];
            if (top != m) return m.parent.location.href
        } catch (a) {}
        return m.location.href
    };

Which seems to be trying all sorts of stuff to figure out "who loaded us" even when referrers are blocked.

> https://s0.2mdn.net/879366/iframe_buster_200_106.js

This has the same exact function, with "n" instead of "m".

> https://s.ytimg.com/yts/jsbin/html5player-new-en_US-vflsLAYSi/html5player-new.js

This one is different: it has a function that starts like this:


    function YR() {
        var a = m,
            b = [],
            c = null,
            d = null;

and ends like this:

        if (e.location && e.location.ancestorOrigins && e.location.ancestorOrigins.length == b.length - 1)
            for (a = 1; a < b.length; ++a) c = b[a], c.url || (c.url = e.location.ancestorOrigins[a - 1], c.Ho = !0);
        return b
    }

where b contains instances of |new WR(c)|.  So this is stashing some state in the return value based on ancestorOrigins.  Earlier c is set to things like e.location.href, e.document.referrer, etc, where "e" starts off at "m" and then goes up the .parent chain. In general, looks like it's just gathering up information about whatever it can in terms of the chain of embedding documents and shipping it out of the function.  What it needs this for is not obvious.

> explain how this is critical to their business.

Critical but can't be done in any non-WebKit browser, right?  ;)

Seriously, my two issues with that use case (which I will assume has to do with figuring out whom to pay how much for the ad correctly) are:

1)  It's not clear to me that as described it requires exposing the whole ancestor chain.

2)  It's not clear to me that as described it needs to work without cooperation from the toplevel page in question.
Comment 30 Erik Dubbelboer 2015-11-17 05:51:05 UTC
YouTube is probably using this to detect fake traffic, as described in this post: https://news.ycombinator.com/item?id=10572585
Comment 31 Philip Jägenstedt 2016-03-16 11:43:40 UTC
location.ancestorOrigins now uses FrozenArray in the spec:
https://github.com/whatwg/html/pull/862

That has no bearing on the issue in comment 29, of course.
Comment 32 Jon Guarino 2016-04-25 18:33:57 UTC
With regard to comment 28 and comment 29, the lack of this feature basically helps facilitate ad fraud:

https://www.comscore.com/Insights/Blog/Domain-Laundering-Emerges-as-Significant-Threat-in-Digital-Ad-Ecosystem

It's almost trivial to implement such a setup that's basically undetectable in browsers like Firefox and IE, and there's not really any other API designed to protect against it.

In terms of point 1 from comment 29, I guess you could argue that just the top domain is sufficient, but it's not clear to me that the perpetrator of such a scam would have to be the top-level page. It would certainly leave a sizable gap through which variations on this technique could develop.

And I think the answer to point 2 about top-level cooperation is obvious in this context.
Comment 33 Anne 2016-12-07 18:59:26 UTC
This is now https://github.com/whatwg/html/issues/1918 as far as I can tell.