Re: Deidentification (ISSUE-188)

On Jul 31, 2014, at 15:30, Roy T. Fielding <fielding@gbiv.com> wrote:

> I suggest that, rather than continue trying to mangle the definition
> of an existing term to be more acceptable, we first decide whether
> we want to be using that term in the first place.  In other words,
> assume we have one big requirement up front that says:
> 
>   Data that is noa is out of scope: none of these restrictions on
>   collection, retention, use, or sharing apply when data is noa.
> 
>   [I am using "noa" here in the faint hope that nobody here has a
>   preconceived understanding of that term (it does have one, but
>   not one in English).]
> 
> So, my next question is do we want to define that as:
> 
>   Data is noa if only a small set of people sworn to secrecy
>   are capable of identifying any human data subject observed by
>   that data.
> 
> or
> 
>   Data is noa if it is impossible (as far as we know) for anyone,
>   including those who made it noa, to identify or re-identify
>   any human data subject observed by that data.
> 
> because the former is closer to de-identified and the latter is
> closer to anonymized.

One obvious point is that the data can only be noa if the people sworn to secrecy are also constrained in when they are permitted to re-identify and in what they can do with the results.  That also seems to be a tar-pit, so perhaps we are, in fact, leaning more towards the second definition. However, I fear there are too many instances in the literature where people claimed or believed that re-identification was impossible and were later shown to be wrong for us to leave it stated as simply as this.

Thank you for the long email; it’s nice to see us getting to grips with problems like this.


David Singer
Manager, Software Standards, Apple Inc.

Received on Thursday, 31 July 2014 23:20:06 UTC