11 Feb 2013

See also: IRC log




<hwest> Hey all - waiting on the phone connection before we get started

<hwest> Anyone on the phone?

<tlr> we can do skype if there's anybody remote.

<tlr> Right now, I'm not sure there is anybody

<rigo> RvE: red orange and green logs

<rigo> red logs have lots of identifying and precious information, e.g. TomTom with geolocation data in it

<rigo> at TomTom, they had an ID and changed it to a pseudonymous ID; within 24 hours you could translate it back, and after 24 hours they break that link

<rigo> unlink data and maintain data that can be only de-identified on behavioral patterns

<rigo> so 1/ de-identification

<hwest> Anyone on the phone?

<rigo> 2/ and manage the risk of re-identification

<rigo> wanted to have speed average information in aggregate, did not want to throw data away

<rigo> SW: de-identification being bundled, technical operational and administrative

<rigo> ===

<rigo> RvE: de-identification by ?? In order to have it work. Can only work if the time the link is active is limited. To get from orange to green you have to de-link it

<rigo> ... if you want to list all the people used, it will get very difficult. Establish de-identification by throwing away the SALT

<rigo> ... wanted to prove that red-orange-green works well. Full URI == red

<rigo> ... URI + link to identifiable data == orange

<rigo> ... throw away the link == green
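
A minimal sketch of the red/orange/green idea described above, assuming the link is a keyed hash over the identifier. The function name `pseudonymize`, the sample records, and the choice of HMAC-SHA256 are illustrative assumptions; the minutes only say "hash" and "throw away the SALT".

```python
import hashlib
import hmac
import secrets


def pseudonymize(identifier: str, salt: bytes) -> str:
    """Replace a raw identifier with a keyed hash (hypothetical helper)."""
    return hmac.new(salt, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Red: raw records carry the real identifier (made-up sample data).
red_log = [("device-123", 52.0), ("device-456", 48.5), ("device-123", 50.0)]

# Orange: identifiers are replaced, but whoever holds the salt can still
# recompute the mapping from a known identifier to its pseudonym.
salt = secrets.token_bytes(32)
orange_log = [(pseudonymize(device, salt), speed) for device, speed in red_log]

# Green (under this sketch's assumption): the records are kept, the salt is
# discarded, so the mapping can no longer be recomputed from the identifier.
green_log = list(orange_log)
del salt
```

Records from the same device still share a pseudonym after the salt is gone, so re-identification from behavioral patterns remains a risk that has to be assessed separately, as noted in the discussion.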

<rigo> JeffWilson (JW): did you do final analysis and result

<rigo> RvE: did so after investigation, was able to quantify the risk, risk was low

<rigo> SW: APEC most stringent things in Japan and South Korea. Most what we discuss would satisfy APEC

<rigo> RW: so finding consensus on something that satisfies Art. 29 would allow us to play everywhere

<rigo> SW: yes, would have worded it differently, we have to try to find common ground

<rigo> RvE: color scheme, green is mostly uninteresting to DPA as normally also accompanied with safeguards

<rigo> HW: hashing, then throwing away the SALT: how often should the SALT be rotated?

<rigo> RvE: if you do not throw away the SALT, it remains in orange

<rigo> SW: as soon as you break the SALT to chunks. In the chunks we do not want to be prescriptive.

<rigo> RvE: agree, depends on the purposes of processing
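
One way to picture the rotation question is a salt that is valid for a fixed period and overwritten when the period ends. The sketch below assumes a daily period purely for illustration; the group explicitly did not want to prescribe one. `RotatingSalt` and `pseudonym` are hypothetical names.

```python
import hashlib
import hmac
import secrets
from datetime import date


class RotatingSalt:
    """Holds one random salt per day and forgets the previous one."""

    def __init__(self) -> None:
        self._day = None
        self._salt = None

    def salt_for(self, day: date) -> bytes:
        if day != self._day:
            # New period: the old salt is overwritten, so pseudonyms from
            # earlier periods can no longer be recomputed or linked forward.
            self._day = day
            self._salt = secrets.token_bytes(32)
        return self._salt


def pseudonym(identifier: str, day: date, salts: RotatingSalt) -> str:
    """Keyed hash of the identifier under the salt for the given day."""
    salt = salts.salt_for(day)
    return hmac.new(salt, identifier.encode("utf-8"), hashlib.sha256).hexdigest()
```

Within one period the same identifier maps to the same pseudonym; across periods it does not. The length of the period is the policy choice that depends on the purpose of processing.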

<rigo> RW: what does that mean?

<hwest> Jeff: all contextual as to what the right period here would be or when something carries personal information

<rigo> FW: only for permitted uses, we have already identified concrete purposes there. Can have different things. Web analytics could be days, some traffic measures need less

<rigo> HW: recapitulates discussion to peterswire

<rigo> SW: we haven't reached consensus on security. There we cannot do de-identification. There, retention periods are longer and the data is not de-identified

<rigo> ... there you would rely on technical operational and administrative measures

Rob: if you want to store data because you're worried about click fraud

it would be good to just store the data you need

Rob: look at the history of the Data Retention Directive in the EU

there was a long discussion about what was necessary to retain

and it was separate from how long

Rob: the whole question of what you actually need to accomplish your goal is a relevant one

what I don't often hear in the DNT discussion is taking into account what I really need to accomplish that

Jeff: Rigo mentioned AOL data incident

there was a small number of people exposed with a data set that large

<rigo> RvE: Security is not a green card to do whatever. Click fraud, e.g.: the long discussion about the Data Retention Directive showed that we needed 6 months - 3 years. Relevant discussion. Real need and necessity for security data collection should be discussed

Even PHI released for public analysis and research is a higher threshold

I want to make sure that if we're saying that there is something to learn, that it was far under the radar

that should count as deidentified data.

Rigo - my point is procedural.

Theoretically, in the future decrypting could take one second

the standard we are doing can't provide this - we can only say how to fix this for now

so that it's workable

because we are having in my opinion

we are sitting between 2 extremes - do nothing and run the cart into the wall

how do you do usable privacy with concrete technical hints

You need to go into the red-tape field and ask technicians

our client is the average website provider

I'm concerned that if we identify rules that require certification, then that's the end of it

Shane - I agree

with that sentiment

deID is not a simple problem.

All concepts of DeID - we have today's state where there is none. Typically, small to medium size companies don't know what to do with that

They don't try to address the data

They probably don't even know what they have

Asking them to rotate keys, do admin controls, etc. - small companies won't know what to do

They can just delete the data.

They can choose not to implement DNT

or I do think that this will spawn a new business and there will be companies

that give you a server plugin that will do deidentification for server users

I think it's important to keep small and medium size businesses in mind

Rigo - we have a good understanding of what we could do

I would, at the risk of creating some kind of error, compare this to the environmental movement

where 30 years ago they were laughed at

and today this creates a billion dollar business.

If we are laying boundaries for businesses, we should be aware of it. This should be workable for businesses.

Shane - we have Small Biz at Yahoo

we host hundreds of thousands of sites today

There are store systems, payment systems, etc.

Rigo - where is the pain point

Heather - The pain will be around what is appropriate

Let's note down that we have agreement on Rob's plan

we need to drill down on Rob's approach for the three color plan

Heather - we need more context

Rigo - we can go to the main group about Rob's approach

We think of this as an approach that could be supported

the trouble is now defining how the links would be constructed - how they would be thrown away.

How is the risk assessment done?

Heather - i would go further

For now, we can work with the hash, throw away the salt technical mechanism,

we should talk about the policy

Rigo - I am against that because once you get a tech solution that everyone agrees to, policy people will bicker and wash away the tech side

Heather - in my mind, we are here to discuss the policy stuff

talk through the definitions

if we are going to address lifetime browsing history, then what is the policy that should be implemented

Because otherwise, there will be differences

Rob - it's about accountability -

<rigo> RvE: time is accountability

Shane - a lot of people see DeIDed and delinked to be the same

<rigo> SW: don't like colors, need new names

depends on your definition

<rigo> => discussion on names for the three buckets we have

Rob - there is a misconception in the US about what is anonymous and pseudonymous

<rigo> RvE: we should not talk about anonymous/pseudonymous

<rigo> stage 1 /2/ 3/

<rigo> JW: raw, transition, de-identified

<rigo> red == raw data

<rigo> orange == obfuscated

<rigo> green == de-identified (that includes de-identified event data and completely aggregated data)

<rigo> orange still change

<rigo> line between orange and green is whether it is still considered personal data anymore. If re-identification is too hard, it goes green

<rigo> red: all of data of red exists also in orange, replace all identifiers with a lookup table. In rotating hash with SALT as orange. You still have the knowledge of SALT and key. Can link in your domain

<hwest> Ok, so raw event data -> managed data -> deidentified data

<rigo> orange == managed data

<rigo> managed data: all characteristics of raw plus some new
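
The three buckets named above can be written down as a small state model. This is only a sketch under simplifying assumptions: the two boolean inputs and the names `DataState` and `classify` are hypothetical, and the real orange/green boundary depends on a re-identification risk assessment that the group had not yet defined.

```python
from enum import Enum


class DataState(Enum):
    RAW = "red"              # raw event data with real identifiers
    MANAGED = "orange"       # identifiers replaced, link still held
    DEIDENTIFIED = "green"   # link thrown away


def classify(has_direct_identifiers: bool, link_retained: bool) -> DataState:
    """Map a data set to one of the three buckets (simplified assumption)."""
    if has_direct_identifiers:
        return DataState.RAW
    if link_retained:
        # The holder still knows the salt, key, or lookup table.
        return DataState.MANAGED
    return DataState.DEIDENTIFIED
```

For example, `classify(False, True)` yields `DataState.MANAGED`, matching the description of orange data as hashed with a salt that is still known.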

<rigo> JW: we could have permitted uses depending on what kind of data is the object. So Security should remain raw. Once it is managed, change requirements

<hwest> Is anyone on the phone?

<rigo> RvE: start with raw. managed, de-identified and then talk about permitted uses

<hwest> Rigo: Several axes - one is time and one is restrictions and one is sensitivity

<hwest> Rigo: The more sensitive it is, the more restrictions we apply.

<rigo> SW: managed means stripping

<rigo> RW: what stripping means is not clear yet

<rigo> HW: some data will remain high level (gives scoring example)

<rigo> RvE: this is augmentation of identity data

<rigo> RW: We have a rather clear understanding of raw and de-identified, but in between we have a range of possibilities and need to hash them out

<rigo> JW: orange data could be stripped to have it less rich than the raw data you got.

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.137 (CVS log)
$Date: 2013-02-11 21:39:27 $
