RE: ISSUE-16, ACTION-166: define (data) collection

David,

I believe you've hit on a key issue: "existence vs. use". While this is debated, many on the industry side believe it is more appropriate to focus on use than on mere existence. In my opinion this is a key divide in the Permitted Uses vs. Unlinkability debate.

- Shane

-----Original Message-----
From: David Singer [mailto:singer@apple.com] 
Sent: Friday, June 01, 2012 2:10 PM
To: Roy T. Fielding
Cc: Rigo Wenning; public-tracking@w3.org; Bjoern Hoehrmann
Subject: Re: ISSUE-16, ACTION-166: define (data) collection


On Jun 1, 2012, at 11:13 , Roy T. Fielding wrote:

> On Jun 1, 2012, at 10:21 AM, Rigo Wenning wrote:
> 
>> you were complaining about the fact that being "exposed" to data is often 
>> interpreted as "collection".
> 
> Actually, I was complaining that the current definition in the
> compliance spec is wrong because it says that collection happens
> when I am merely exposed to data.  Nobody in the real world
> equates exposure with collection.

I agree. I still prefer 'is exposed to' (it passively comes to me), 'collects' (I took active steps, such as running a script or doing a database lookup), 'retains' (I held onto it after the request-response transaction was complete), and 'shares' (I passed it to another party).

I am not sure 'use' really concerns us; the privacy concern is, I think, the existence of the database. Anyway, why bother retaining data if you never use it? The whole point of retained data is to use it.

>> A definition that is tautological doesn't solve 
>> your issue. A definition that imports hairy problems of identification into 
>> the definition isn't buying you peace either.
> 
> We know what "data" means without a specific definition in the
> standard.  We know what "collection" means without a specific
> definition in the standard.  The only reason we might want to
> define "data collection" in the standard is if we have something
> more specific in mind than the mere conjoining of those two
> words would explain.  For example, if we want to apply
> requirements to a specific sort of data being collected (PII)
> or a specific mechanism of data collection (cookies) such that
> readers are not confused by all the other forms of data collection
> that are not constrained by this standard.

Yes. We could usefully have a definition of the subset of all data that is of concern to us, I rather think.  I would not include aggregate counts in that definition, so then we don't need an exception for stuff that should be out of scope in the first place.

In the past I have used "Personally Derived Data" (PDD), i.e. individual records that derive from a person's interaction (whether or not they are currently identified to a specific person).


David Singer
Multimedia and Software Standards, Apple Inc.

Received on Friday, 1 June 2012 21:15:16 UTC