Re: Deidentification (ISSUE-188)

On Jul 23, 2014, at 9:42, Roy T. Fielding <fielding@gbiv.com> wrote:

> On Jul 23, 2014, at 9:32 AM, David Singer wrote:
>> and to capture the thought in process, what I was floating on the call just now was to add:
>> 
>> A data set is considered de-identified when:
>> a) there exists a reasonable level of justified confidence that none of the data within it can be linked to a particular user, user agent, or device;
>> b) and the creator of the data set commits not to re-identify any user, user agent, or device that contributed to the data;
>> c) and the creator either restricts recipients from any such re-identification or accepts responsibility for any such re-identification.
> 
> Hmm, well, I still don't think it is necessary to include that in the
> definition.  I think we should have a separate requirement on publication
> or sharing of tracking data, rather than twist de-identified into an
> implied behavioral requirement.  After all, "42" is de-identified
> regardless of any later commitments or efforts at re-identifying.
> 

I understand your hesitation and share some of it.  However, I feel that:
* de-identification has been defeated often enough that we cannot be sure those who attempt it will always succeed;
* a user who is harmed should be able to work out who has responsibility: someone who defied a restriction on the data, or someone who made it available without that restriction.

There are, alas, enough people out there who would try to engineer a situation in which it appears no-one is responsible ("we did our best to make it de-id'd", "no-one said we couldn't try to re-id") that I think we need to close that chink somehow, formally.

David Singer
Manager, Software Standards, Apple Inc.

Received on Wednesday, 23 July 2014 17:23:12 UTC