Re: ISSUE-16, ACTION-166: define (data) collection

On Jun 1, 2012, at 11:13, Roy T. Fielding wrote:

> On Jun 1, 2012, at 10:21 AM, Rigo Wenning wrote:
> 
>> you were complaining about the fact that being "exposed" to data is often 
>> interpreted as "collection".
> 
> Actually, I was complaining that the current definition in the
> compliance spec is wrong because it says that collection happens
> when I am merely exposed to data.  Nobody in the real world
> equates exposure with collection.

I agree.  I still prefer the terms 'is exposed to' (the data passively comes to me), 'collects' (I took active steps, such as running a script or doing a database lookup), 'retains' (I held onto it after the transaction request-response was complete), and 'shares' (I passed it to another party).

I am not sure 'use' really concerns us; the privacy concern is, I think, the existence of the database. Anyway, why bother retaining data if you never use it? The whole point of retained data is to use it.

>> A definition that is tautologic doesn't solve 
>> your issue. A definition that imports hairy problems of identification into 
>> the definition isn't buying you peace either.
> 
> We know what "data" means without a specific definition in the
> standard.  We know what "collection" means without a specific
> definition in the standard.  The only reason we might want to
> define "data collection" in the standard is if we have something
> more specific in mind than the mere conjoining of those two
> words would explain.  For example, if we want to apply
> requirements to a specific sort of data being collected (PII)
> or a specific mechanism of data collection (cookies) such that
> readers are not confused by all the other forms of data collection
> that are not constrained by this standard.

Yes. We could usefully have a definition of the subset of all data that is of concern to us, I rather think.  I would not include aggregate counts in that definition, so then we don't need an exception for stuff that should be out of scope in the first place.

In the past I have used "Personally Derived Data" (PDD), i.e. individual records that derive from a person's interaction (whether or not they are currently identified with a specific person).


David Singer
Multimedia and Software Standards, Apple Inc.

Received on Friday, 1 June 2012 21:10:14 UTC