RE: tracking data (was Re: [TCS] comments on 17 Feb 2015 editors draft)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi Nick,
My main point is that we should not just remove the definition of “tracking data”, because the term is used in several places: in third-party compliance, in the definition of “permanently de-identified”, and in Example 4. They all use it in the sense of data that makes it possible to recognise the same individual across multiple network transactions.
I agreed with Roy that the current definition was not very clear and a bit circular, and I suggested different text for clarity; that is all. Given a choice, I would prefer keeping the existing text to just removing the definition, and you are right that it matters when you read the first-party compliance section to understand what should not be shared. It is now very common to use a UID scoped to the first-party origin and then use that to track people across contexts; third-party cookie blocking has made this almost de rigueur.
Your idea for a resolution is good, i.e. limiting first-party sharing to permanently de-identified data, but we would have to take “tracking data” out of that. Making it “original data” might work, but that strips out any reference to the central mechanism of tracking, so it would still leave obscure what a first party can share.
Mike


> -----Original Message-----
> From: Nick Doty [mailto:npdoty@w3.org]
> Sent: 25 March 2015 01:55
> To: Mike O'Neill; fielding@gbiv.com
> Cc: public-tracking@w3.org
> Subject: tracking data (was Re: [TCS] comments on 17 Feb 2015 editors draft)
>
> Hi Roy, Mike,
>
> >>  2.11  Tracking
> >>
> >>   Tracking is the collection of data regarding a particular user's activity
> >>   across multiple distinct contexts and the retention, use, or sharing of
> >>   data derived from that activity outside the context in which it occurred.
> >>   A context is a set of resources that are controlled by the same party or
> >>   jointly controlled by a set of parties.
> >>
> >>   Tracking data is any data that could be combined with other data to engage
> >>   in tracking a user across different contexts.
>
> From Roy:
>
> > Wait, that's new.  Tracking data is already implicitly defined by the
> > first paragraph, above, to be the data collected when tracking.  I am pretty
> > sure that is how we use it, as well, so we don't need another definition.
> > The above definition changes it to mean the data used to enable tracking,
> > which isn't at all like we are using it.
> >
> > In any case, the definition is incorrect because any data
> > "could be combined with other" (tracking) data to make more tracking data,
> > which implies that all data is tracking data.
>
> I don't think a regular reading of this definition would imply that any data is
> tracking data. The number 42, when combined with data about Nick Doty's
> browsing activity on many sites, does not allow one to engage in tracking me
> across different contexts; it's the data on my browsing activity that's used to
> engage in tracking. I believe no permanently de-identified data would qualify.
>
> Similarly, our definition of "permanently de-identified" includes a clause about
> identifying a user "alone or in combination with other retained or available
> information"; I don't believe that's interpreted to mean that permanently de-
> identified data cannot exist.
>
> > If we need an explicit definition, it could be something like
> >
> >  "Tracking data is any data collected or derived as a result of tracking
> >   that would not have been known without tracking."
>
> When we discussed variations for issue-203 from September onward, the
> suggestion of using "tracking data" was to limit the scope of compliance
> requirements but also to note that sharing data from one context (itself not
> tracking) isn't compliant with a user's preference where it enables tracking by
> someone else. That is, in some cases we want to limit the sharing of data that
> enables tracking even if the collection of any particular datum isn't tracking.
>
> From Mike:
>
> > I do not agree with Roy here about this being redundant. This definition is
> > important because it is used in de-identification and clarifying examples. It is not
> > simply "data collected when tracking" because it is referring to the specific data
> > used for linking, as was discussed when we talked about Gateways, i.e. raw UIDs
> > associated with other data e.g. URLs is tracking data.
> >
> > It could be better expressed. How about:
> >
> >> Tracking data is any data that enables a user agent to be recognised across
> >> different contexts when combined with other data collected in those contexts.
> >> Examples include cookie UIDs or source IP addresses when collected together
> >> with targeted URLs.
>
> I think that's Mike's concern with Roy's text: that if "tracking data" is only data
> that comes from tracking, then there would not be any limits on sharing
> personally identifiable data that only refers to browsing activity in a single
> context.
>
> One resolution would be to put limits on sharing of any data from a network
> interaction that isn't permanently de-identified (rather than on "tracking data").
> That would be added to the third-party compliance section, and be similar to the
> sentence we have in first-party compliance about sharing data that a party
> would be prohibited from collecting. In that case, we could delete the above
> definition of "tracking data" and instead refer to "data not permanently de-
> identified" (in the server compliance sections) or just "data" in the de-
> identification definition.
>
> Thoughts?
>
> Thanks,
> Nick
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using gpg4o v3.4.19.5391 - http://www.gpg4o.com/
Charset: utf-8

iQEcBAEBAgAGBQJVEtPuAAoJEHMxUy4uXm2JM3EIAIntXujNbrDsQKxbtpcfCEUE
iP5+KMcxKNPs2gK6HdxbFWpyDicIFKhyGAMpzO2dLiPmLQgt2qj/X0QfJunnsfTH
zm+98uurSsFphutP3OxdTwx/mRsfxFjXHdBLCgzs0wlAAGnRC1DM6X2hyZeI+iLY
gUN/BexdOuKyAxLQf6cjBAtTcMY7AtablR8h0TESxh/Jp+x1rGE6U6AzZMeUvH1/
qFUoTmVq8GCbSmNCCM3ieE4I33TLQCoXcInF9pe0NB44XpJFmWM+/c5TLJo2PdPv
a7Z3vDI1Bpg3vYaGIOqrtC1k5aQ23QHYR2RFiO/WzZCNN3WQ8cF+6AykylhCcfE=
=EDm2
-----END PGP SIGNATURE-----

Received on Wednesday, 25 March 2015 15:28:21 UTC