W3C

- DRAFT -

Tracking Protection Working Group Teleconference

20 Feb 2013

Agenda

See also: IRC log

Attendees

Present
eberkower, npdoty, walter, Thomas, +44.772.301.aaaa, PhilPearce, Aleecia, +1.404.385.aabb, peterswire, Rigo, moneill2, +1.408.836.aacc, hefferjr, +1.202.587.aadd, kulick, +49.431.98.aaee, ninjamarnau, Yianni, Fielding, dsinger, [CDT], +1.202.331.aaff, +1.650.704.aagg, Keith_Scarborough, Peder_Magee, +1.703.888.aahh, [Microsoft], +1.917.934.aaii, vinay, +47.23.69.aajj, ChrisPedigoOPA, AnnaLong, SusanIsrael, adrianba, johnsimpson, +1.202.344.aakk, +1.646.825.aall, hwest, dwainberg, BerinSzoka, +1.215.286.aamm, cOlsen, Dan_Auerbach, JeffWilson, MikeZaneis, +1.202.478.aann, Bob_Ivins_Comcast?, +1.650.391.aaoo, robsherman, Brooks, +1.646.666.aapp, chapell, Chris_Pedigo, +33.6.50.34.aaqq, vincent, RichardWeaver, +1.650.365.aarr, Jonathan_Mayer, +1.202.639.aass, +1.202.478.aatt, rachel_thomas?, +1.650.787.aauu, +1.917.318.aavv, chapell?, Chris_Mejia
Regrets
Chair
Swire
Scribe
moneill2, rigo


<tlr> trackbot, start meeting

<trackbot> Date: 20 February 2013

<sidstamm> hi all, I'm double booked today and will be on IRC for now but try to dial in later

<npdoty> meeting: Tracking Protection Working Group teleconference

<npdoty> chair: peterswire

<Walter> :-)

<Walter> tlr: it has become self-aware?

<tlr> that isn't novel

<Walter> it still should be open sourced

<peterswire> 404 number is swire

<sidstamm> hi all… regrets for missing the beginning of the meeting. I'll be watching IRC for now but try to dial in later.

<peterswire> ok, i muted until we start. took care of background noise?

<Walter> peterswire: yes, that was helpful

<Walter> peterswire: if you can get a headset at the last minute...

<peterswire> the noise was the chair's effort to eat before the call; sorry on that

<npdoty> volunteer to scribe?

<Walter> sorry, it is bloody hard when you're on skype and not a native speaker

<npdoty> scribenick: moneill2

review Boston work plan

<vinay> zakim mute me

constructive meeting in Boston; now time for action items

<aleecia> Thanks Adrian, I'm aware. It's just good to have the agenda linked in the minutes properly. It was a hint. :-)

def of service provider 1st - do that later

market research is now 1st

<justin> Richard Weaver

<eberkower> ComScore = Richard Weaver

chris pedigo now on, so back to service provider

Service Provider

service provider works only for you, i.e. is an agent

controller and processor in US and in EU

<fielding> http://www.w3.org/2011/tracking-protection/drafts/CambridgeBareBones.html#def-service-providers

chris pedigo worked on these issues before, peter has talked to chris about taking this on

<aleecia> (note we do not have liability for controllers due to their processors in the US, a rather important change. I repeat myself, but it appears to keep getting dropped as a rather important issue.)

<fielding> +1 for processors

<Walter> +1 too

<aleecia> -1

peter suggests using European data processor definition

<aleecia> suggests a legal basis we do not have

<dsinger> moving to a more general term might help the "processor for a third party getting consent" (minor) issue

<Walter> hm, good point, another question is how it meshes with the 1st and 3rd party issues

aleecia, EU has different legal regime than US

<rigo> Aleecia: data controller has no liability if the data processor does something wrong

<haakonfb> Agree with Aleecia's points

Peter, US does not have a def. of service provider

<Walter> rigo: that is not true everywhere in the EU, in .nl it is a classic principal-agent relationship

<aleecia> we rejected data processor explicitly, Rigo

susan israel, terms would be allocated by contract in US

<aleecia> and yes, it was long ago

susan, we should not make a block statement

<justin> Not sure I agree with aleecia's point, but I could see objections if we picked the word "agent" for the specific reasons she articulates.

<rigo> aleecia, we agreed to name it service provider and in the definition we agreed to use the processor definition

dwainberg, shares aleecia's concerns

dwainberg: blank slate bad idea

<aleecia> ah, I thought you were discussing terms not definitions, Rigo.

<rigo> we said "service provider" and David Singer had text

<kulick> did dwainberg say he believes blank slate WAS a bad idea?>

<amyc> I think he said the opposite

<kulick> i thot so, thx

<dwainberg> no,kulick, good idea

walter, david's concerns valid but do not share them, prefer to have conv. on email. Maybe agent better term

<kulick> ok, thx

<rigo> aleecia, we said "service provider" to avoid offending US feelings, remember? :)

peter: agency law much overlap

peter, chris has agreed to work on this.

<fielding> The term "service provider" is used in a hundred different contexts to mean a hundred different things; it is an awful choice for a defined term. In our context, it is normally used to refer to the either the entity providing user access to the Internet or the hosting provider for a site.

<aleecia> rigo, that's not the summary I would give. :-) But we are remembering the same conversations, including David Singer

<Walter> fielding: +1

<vinay> I'd like to work with chrispedigoopa on the definition

peter: anyone else work with chris?

<ChrisPedigoOPA> sorry, got dropped from the call

<ChrisPedigoOPA> let me know who wants to work with me

peter, chris - what time frame

<aleecia> we ran into issues with different legal regimes. The trick was to find something that works in all, without implying things untrue. We had this conversation even more strongly around using "first party" or not

<aleecia> Near as I can tell, we were at consensus to use "service provider" and we are undoing prior work.

chris, this is defining term right?

<fielding> Using the term "data processor" as it is defined by the EU does not import the EU laws -- it just makes it far easier to know who fits the definition and far less likely that our arbitrary redefinition won't be wrong.

chris, something by next week

<npdoty> ACTION: pedigo to work on updated "service provider"/"processor" definition (with vinay) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action01]

<trackbot> Created ACTION-368 - Work on updated "service provider"/"processor" definition (with vinay) [on Chris Pedigo - due 2013-02-27].

<aleecia> Roy, in general name space collision is a confusing thing

<aleecia> sure

<rigo> "Agent" would be also cool. And some of the trackers are then "secret agents"

<aleecia> rigo++

<Walter> :-)

nick, if we can't resolve now we should have a brief turn on email

<johnsimpson> are we trying to decide what to define or what are we doing?

<fielding> at the W3C, agent is already a defined term

<aleecia> Chris is defining a term we're not sure we should use :-)

<rigo> fielding: Party pooper

<rigo> :)

<johnsimpson> are we saying they are three different things?

what about processor?

<npdoty> peter is looking for volunteers to support: service provider, agent and processor/controller?

<ninjamarnau> I think Peter meant processor instead of controller

<aleecia> We might summarize existing work on the mailing list as well

peter, continue on list

<Walter> moneill2: that's what aleecia and dwainberg are uncomfortable with

<tlr> +1 to aleecia

peter, next item Market Research

<Walter> oh, drat, that was scribing, apologies

<aleecia> Rather than risk losing that to a blank slate approach

<npdoty> +1 to aleecia, tlr, if someone can summarize the past history on the list, that would be great

<johnsimpson> what is Chris defining?

Market Research

peter, current definition too broad

<rigo> johnsimpson: processor

peter: industry says anything can be market research unless otherwise defined.

<aleecia> nick we should likely figure out who has the action here. The two of us or the editors would be good candidates. This is not an open week for me, so I'd rather either have two weeks or better yet find someone else to take it.

<johnsimpson> so are we saying that processer and service provider are the same thing?

peter, no consensus currently on definition

<aleecia> @johnsimpson we're waiting to see if Chris suggests that, we're not saying anything yet

peter, many say DNT: 1 means no tracking of any kind

peter, justin brookman has agreed to work on this with others. Action item?

<Richard_comScore> David and I are working on the MR definition

<rachel_thomas> DAA definition of market research isn't "unbounded." Here is the definition - Market Research means the analysis of: market segmentation or trends; consumer preferences and behaviors; research about consumers, products, or services; or the effectiveness of marketing or advertising. A key characteristic of market research is that the data is not re-identified to market directly back to, or otherwise re-contact a specific computer or device. Thus,[CUT]

peter, david stark not on call

<Richard_comScore> We have scheduled a call with Justin to discuss further

<dsinger> I think it would help to understand what aspects of market research need personally identifiable data, and how that identifiable data can be narrowly scoped in both breadth of data and retention times

<aleecia> Is there someone in {Nick, David Singer, Heather, Justin} up for summarizing where we are on service providers to the mailing list, since this is not a great week for me?

<tlr> justin, were you trying to queue?

<susanisrael> I am also willing to help with the market research definition

<aleecia> walter: market research with DNT:1 makes it a farce, particularly in EU

<aleecia> walter: this was shot down in earlier F2F, in Oct, then long time of silence.

<justin_> I was on the queue for the previous discussion (service provider) but that moment has passed :) I'm sending my concerns to Chris and Vinay.

<hefferjr> I am also very interested in this topic, and would like to be involved.

<dsinger> to aleecia: I am not 100% sure I understand the current state myself; I think the first_party resource plays here, but exactly how I need to understand

<aleecia> … in favor of having this process so if someone wants to bring up a concept, come up with a proposal (missed rest)

walter, DNT:0 OK for permitted use for market research. Definitely not for DNT:1 - would make this a farce, thought this had already been discussed, in favour of the process but someone needs to come up with a concrete proposal

rachel_thomas: daa definition is very valid

<aleecia> @david, ok, here's where it's sold, here's where it's not in prior work is useful. If I take a first pass will you sanity check me? This will need 2 weeks on my side.

<fielding> IIUC, the "unlimited" refers to the scope and amount of data collected, not the purpose to which the data is used.

<vinay> Thanks justin_.

<justin_> "'market research' is . . . analysis of . . . consumer preferences and behaviors" (inter alia) . . . that's not a lot of bounds!

<rachel_thomas> justin, it's important to note the bounds on the definition - NO "sales, promotional, or marketing activities directed at a specific computer or device."

<vincent> "research about consumers, products, or services" seems quite broad to me

mikez: daa def. not unbounded, any permitted use would be unlimited

<ninjamarnau> to me this definition sounds fairly unbounded. Why not use just anonymized data?

<susanisrael> moneill2, what I think Nick was trying to say is that it was Rachel, not me, speaking to defend the DAA definition

<rigo> I think the major issue is the "identified" vs "identifiable". So pseudonyms seems to be on one or the other side

<aleecia> ACTION: aleecia to summarize texts, agreements, and uncertain bits to data around service providers (ideally with dsinger and perhaps npdoty, if willing) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action02]

<trackbot> Created ACTION-369 - Summarize texts, agreements, and uncertain bits to data around service providers (ideally with dsinger and perhaps npdoty, if willing) [on Aleecia McDonald - due 2013-02-27].

<kulick> rachel_thomas, could you please provide a link to the published definition?

<aleecia> there's no way that's pending review :-)

<rachel_thomas> published DAA definition, see page 10: https://www.aboutads.info/resource/download/Multi-Site-Data-Principles.pdf

mikez: market research necessary for internet economy, this issue is not closed, some discussion had been folded into other language, but need for market research remains strong

<kulick> thx

<vinay> Kulick -- http://www.aboutads.info/msdprinciples has the overview + a link to the full text

<justin_> rachel_thomas, sure, but that's not really research! You could prohibit the teasing of otters too, but that wouldn't be a huge limitation :)

<Zakim> dsinger, you wanted to ask about identifiable data

<Walter> rachel_thomas: you're sidestepping that the collection of the data in the first place, regardless of its goal is hard to swallow, especially given a clear opting out signal

davidsinger, if data is unidentifiable then no longer in scope

<Walter> very valid question, if it is unidentifiable it is indeed out of scope anyway

<aleecia> http://www.w3.org/2011/tracking-protection/track/issues/178

<rigo> support for David

dsinger, let's see proposal why identifiable data needed

<dsinger> Given that we consider un-identifiable data OK (out of scope) -- either de-identified or aggregate counts -- I think I need to understand why identifiable data is needed, and how a definition would scope what the data is, how long it will be kept, and how use

<rvaneijk> (... commuting with bad wifi)

peter, good suggestion from david

<aleecia> oops, wrong issue, sorry -

<rvaneijk> unlinked counts as well !

<rachel_thomas> q_

<aleecia> we were here: http://www.w3.org/2011/tracking-protection/track/issues/25

peter, thought is that there is a subset of market research that needs identifiers, this could narrow the universe

<npdoty> Kathy Joe from ESOMAR also presented this proposed permitted use: http://www.w3.org/mid/CC930464.12322%25kathy@esomar.org

<rigo> scribenick:rigo

<moneill2> rachel_thomas, not accurate to say self reg codes only followed by academics

<rachel_thomas> that's not an accurate description of how market research self-regulation works. market researchers within companies abide by those same standards across all industries.

<npdoty> proposed actio: weaver to propose narrower "market research" use (with David Stark, Justin, Susan, Ronan)

<hefferjr> hefferjr

<moneill2> peter, anyone else work on this?

<rachel_thomas> I'm happy to participate in that group as well, please.

<fielding> There are no bounds constrained by the DAA definition. Whether it makes sense to have a market research exception or not, we have to be realistic about the implications of data collection that has no limited purpose, no limited scope, and no inherent consent. Actual market research uses consent. This collection is just to select a sample of applicable users (a focus group), which doesn't justify an exception to DNT:1.

<scribe> scribenick: rigo

<dsinger> fielding++

<Walter> fielding: again, +1

<eberkower> Please add Elise Berkower to the list

<johnsimpson> Wasn't there already language proposed on this? http://lists.w3.org/Archives/Public/public-tracking/2012Oct/0089.html

<eberkower> thank you

<moneill2> peter, david stark, richard, jbrookman, susan israel, rachel thomas, chris mejia,

<tlr> peter: David Stark, Richard Weaver, Justin Brookman, Susan Israel, Rachel Thomas, Chris Mejia, Ronan Heffer, Elise Berkower to work on "market research" proposal

<susanisrael> yes, I volunteered so that I can call upon the expertise of a colleague who could help

List: Chris Mejia + ?? from Nielsen

<npdoty> proposed actio: weaver to propose narrower "market research" use (with David Stark, Justin, Susan, Ronan, Rachel, Chris_M, EBerkower)

<aleecia> I am curious to know if any prior decisions are expected to carry over, and if so, how we are to know which ones.

<hefferjr> Ronan Heffernan and Elise Berkower are from Nielsen

<moneill2> tlr, thanks

<eberkower> Ronan Heffernan and Elise Berkower from Nielsen

<susanisrael> rigo, I think Ronan and Elise were from nielsen

PS: what is the difference between market research and just gathering data?

<moneill2> peter, lets get that in 2 weeks

PS: will follow up by email with the group

<johnsimpson> WHAT ABOUT THE TEXT THAT WAS ALREADY PROPOSED?????

PS: slightly change the agenda because of speaker availability

<moneill2> peter, talk about security matters then return to de-id

<susanisrael> npdoty, I think 2 people are trying to scribe at the same time

<npdoty> ACTION: weaver to propose narrower "market research" use (with David Stark, Justin, Susan, Ronan, Rachel, Chris_M, EBerkower) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action03]

<trackbot> Created ACTION-370 - Propose narrower "market research" use (with David Stark, Justin, Susan, Ronan, Rachel, Chris_M, EBerkower) [on Richard Weaver - due 2013-02-27].

Security

PS: Guest Speaker is Jon Callas, security expert. CTO of PGP, later at Apple, security for OS, CTO of Entrust, this year new venture

<moneill2> peter, introduces Jon Callas

<scribe> scribenick:moneill2

<johnsimpson> What is the status of this text???

peter, permitted use is an essential area, people disagree about duration

<johnsimpson> Issue 25

<johnsimpson> Aggregated data

<johnsimpson> 6.1.1.1 Short Term Collection and Use for market research

<johnsimpson> Note

<johnsimpson> Information may be collected and used for market research and research

<johnsimpson> analytics, so long as the information is only retained for the time

<johnsimpson> necessary to complete the research study. This is providing that the raw

<johnsimpson> information is not transmitted to a third party, the information is not used

<johnsimpson> to build a commercial profile about individual users or alter any

<johnsimpson> individual's user experience, and there is no return path to an individual.

peter, need sense of what's needed in real world

<johnsimpson> A key method for ensuring privacy while collecting and processing large

<johnsimpson> amounts of data is removing any link to a device identifier. Raw data for

<johnsimpson> market research may contain for example an IP address or a marker for a

<johnsimpson> cookie, which may be temporarily retained for sample and quality control as

<johnsimpson> well as auditing purposes. No individual can be identified in the subsequent

<johnsimpson> aggregated statistical report.

<aleecia> Nick - jsyk - updated action-369 (new on this call against me) for three weeks out, since I will not have time in the next two weeks. I still suggest someone else take this one.

<johnsimpson> did phone go dead?

peter, discussion with Rina Mears about auditing, will come back in 3 weeks

<rachel_thomas> lost peter...?

<npdoty> aleecia, I don't feel particularly informed about the history of that issue, or would take it

<hefferjr> audio is still good for me

<johnsimpson> lost peter

peter, Jon Callas?

<jon> I am here, too.

<johnsimpson> will call back in

<rachel_thomas> calling back in

<aleecia> thanks Nick, I appreciate that - just a busy time here

peter, Jon give us a sense of service attacks, length of time needed to retain data

<susanisrael> *Nick, do you want me to scribe?

<susanisrael> +npdoty, ok, good

Jon Callas, been on both sides, you need both marketing data and security data, they should be different

<johnsimpson> zamik. mute me

<aleecia> confused. what time outs on data?

Jon Callas, way to look at timeouts - time from incident, also time you are doing investigation + time after to retain data

Jon Callas, when does timeout start

<aleecia> ah. speaker is assuming a fixed and short retention period. missed that.

<npdoty> aleecia, I believe Jon is referring to time-based retention limits

<aleecia> thanks nick

Jon Callas, hard to de-identify IP address

peter, how long do people retain IP addresses

<aleecia> solving the problem -> dealing with security threats? or protecting privacy?

Jon Callas, we are more interested in solving problem, not go on for weeks or months, need to correlate them between attacks, how do you manage that?

<vincent> I'd say, it's the first

peter, clickfraud how to manage?

<ninjamarnau> data retention and data sharing are two issues. We should keep these seperate.

Jon Callas, something of a longer time period needed -

<dwainberg> there's impression fraud, as well

peter, 60,90 days, years?

Jon Callas, midpoint

Jon Callas, rule of thumb - no standard

Jon Callas, depends

peter, how long to resolve incident

<peterswire> david -- I see you, and will look for a break

Jon Callas, can't resolve on same computer - need to do it on network, holding data that is active is reasonable

peter, how long second period

<rvaneijk> (... off to bike home, will try to catch the last part of the call)

<dwainberg> here's my question, if you want to pass it on: what about the problem of identifying and learning to detect problems. In the ad biz you may not understand there is a problem until retrospetive pattern analysis on months worth of data.

Jon Callas, 60-90 days a long period but always exceptions, but often just a few weeks

<aleecia> are we expecting retention limits for first parties as well?

Jon Callas, you would keep summary to help with next attack

<rigo> if an investigation is ongoing, nobody disputes that you could keep data, rather after end of incident and protocol chatter without default storage incident

Jon Callas, relatively long periods for some attacks, otherwise not needed

Jon Callas, need to separate security data from marketing data

peter, how to separate

Jon Callas, admin controls only

peter, logging, auditing

dwainberg, ad biz has problems other than clickfraud, need to do retrospective pattern analysis

dwainberg, hard to put timeframe on that

<fielding> I cannot underemphasize this … No changes will be made to security data collection or analysis based on the presence of DNT:1. Security is not subject to opt-out (not even in the EU). It is sufficient to ensure that such data is only retained when (and as long as) necessary for the security purpose and not used for any other purpose.

<rigo> but if the user is not tracked, the ad network gets less money, so no real incentive for click-fraud with DNT:1

<johnsimpson> Roy makes an interesting point in IRC

Jon Callas, needs for security to have much data

cookie UIDs or just IP addresses?

<rigo> fielding++

<amyc> rigo, that is not correct

<aleecia> if your customers think they are overpaying, Rigo, they are less likely to use your business

<Brooks> there is also an aspect of seasonality to data. ESPN.com sees very different traffic behavior in March (march madness) than it will the other 11 months of the year

<vincent> for those interested, a pretty neat paper on various kind of frauds: conferences.sigcomm.org/imc/2011/docs/p279.pdf

dwainberg, some activity is just strange, not fraud but you can't pin it down, bots, spiders

<aleecia> for seasonality, presumably espn.com has ample non-DNT:1 traffic to get a handle on that

dwainberg, have yet to identify some

<aleecia> (in response to Brooks)

peter, why not keep data for ever?

<hefferjr> but all of the fraudulent bots might turn-on DNT:1 to try to slip through undetected. only analyzing DNT:0 (or other non-DNT:1 traffic) could be very counter productive

Jon Callas, not forever - breach disclosures a problem, so data deleted when upgrades, new tech etc,

<justin_> john callas: risk of data breach can be a forcing function to limit retention. But many in the field believe that Big Data can solve all the problems. Also, the data is less valuable over time.

<Brooks> Aleecia, so if I want to behave badly, I just need to issue DNT:1?

Jon Callas, mobiles very common now

<hefferjr> +Brooks

Jon Callas, 5 yrs too long for mobile

<aleecia> it appears we are having different conversations, Brooks. If you are talking about security, that is a different set of issues.

peter, sub poena

<rachel_thomas> subpoena :)

<aleecia> it seemed you were talking about seasonality which does not seem like the sort of thing you need lots of DNT:1 data for

<justin_> john callas: having to deal with subpoenas/e-discovery is a cost. A deletion policy is one way to mitigate those costs (or aggregation/anonymization)

Jon Callas, e-discovery: need policy on when to delete or anonymise data

<Walter> rigo: yes, I'm noticing it as well

Jon Callas, nothing is immune to e-discovery request

<justin_> john callas: Security logs are not immune from discovery requests.

<Brooks> not an easy place to have a discussion of the differences between "security" and "quality" and "fraud"

peter, tagging purposes of data - how does that work

<aleecia> fair enough. and my brain is in fog from being sick (again) so if I do not follow, odds are good it is at least primarily my failing

Jon Callas, simple admin controls can do that, we don't share security data for marketing purposes

peter, segregation in databases

Jon Callas, meaningless these days - admin controls enough

<dsinger> "Information may be collected, retained and used to the extent reasonably necessary for detecting security risks and fraudulent or malicious activity. This includes data reasonably necessary for enabling authentication/verification, detecting hostile and invalid transactions and attacks, providing fraud prevention, and maintaining system integrity. In this example specifically, this information may be used to alter the user's experience in order to reasonabl[CUT]

<dsinger> a service secure or prevent fraud. Graduated response is preferred when feasible.

<dsinger> There has been an unresolved discussion on whether "graduated response" should be in the normative text, defined, addressed through non-normative examples, or not included at all."

dsinger, already have definition - have you read it?

<aleecia> David, could you read it?

<aleecia> I think that might help the discussion.

<npdoty> great!

<aleecia> ooh, speaker reading on IRC, cool

Jon Callas, not yet - reading now - no problem with that

<Zakim> dsinger, you wanted to ask about the text we have

Jon Callas, i am willing to accept that

dsinger, limited purpose is key

<npdoty> I think we separately note in a section above "no secondary uses" and "data minimization"

<aleecia> Yes, that's a global for permitted uses

<fielding> Note that first party sites often use third parties to estimate security risk based on pattern recognition, which would fall under the general category of "sharing" for a limited purpose.

Jon Callas, logs kept 7-10 yrs would raise eyebrows but 60 days or so no problem with that

<justin_> fielding, wouldn't service provider/data processor exception apply in that case?

<dsinger> to fielding: but they do this under a contract, such that results on their data only come back to them? i.e. they are an 'agent'? or is the data merged into a pool that all get benefit from?

<fielding> justin_, no because they don't silo the data -- it is based on multiple site patters

<npdoty> justin, do we need to clarify in "No Secondary Uses" that data can't be re-used for a different purpose, even if that purpose is permitted?

chrism, new use case - consumer protection taskforce - privy to top security experts - one case is threat discovered last 6 mo

<justin_> fielding, Hrm. But can individual users or devices be correlated across those databases if they're really just pattern recognition evaluators?

chrism, prosecutor asked how long back attack was happening

<npdoty> justin, so that data retained for a long time for security can't be re-used later for some other purpose?

chrism, so far can go back 5 yrs, prosecutor wants it not only to determine harm but how to punish crims

<justin_> npdoty, Well, if there's an independent and separate exception . . . so what? What's the threat you're worried about?

chrism, over 5 years - law enforcement needs historical info

<rigo> nick, strange, I'm locally muted

chrism, are you familiar?

Jon Callas, yes, good for putting bad guys away

<peterswire> restitution

chrism, retribution also imp.

<fielding> justin_, I wouldn't say they are "just" using patterns (this is an extraordinarily NDA'd subject area) -- the purpose is definitely to distinguish bad individuals (or zombies) from good individuals and I am not completely familiar with the techniques used.

Jon Callas, payback imp - but data being held on innocents also important. needs balance

Jon Callas, privacy very important to people

<npdoty> justin, using years of security logs for frequency capping, market research, anonymizing longitudinal data after years for other purposes...

chrism, some place reasonable - but hard to say where it is

chrism, balance - control rather than retention

<fielding> … and unlike the ad case, first parties are typically looking for purchase fraud or ineligible buyers (like concert ticket vendors have to prevent market resellers from purchasing all tickets in the first 3 seconds they go on sale)

aleecia, happy with text applied to 3rd parties - is there a distinction between clickfraud and viewfraud

<justin_> fielding, So are you proposing to add "share" to the security permitted use?

Jon Callas, no difference from security pov, but clicks & views should not be kept forever

<Chris_IAB> respectfully, that's a personal opinion for John as a consumer.

<fielding> justin_, yes, though in very limited form "share for the exclusive purpose of security" or something

Jon Callas, retention limited but lock up bad guys

<fielding> … and under NDA

peter, john is committed to privacy and security so useful input,

Jon Callas, DNT important to everybody; security need not diminish privacy

<npdoty> justin, or a dis-incentive to developing any more privacy-preserving techniques for frequency capping, ad reporting, etc. if they can just re-use security data

peter, helpful some commentary on retention versus ?

peter, de-id issue

De-identification

<Chris_IAB> security issue - retention vs. control

<aleecia> truncated uri does not have an issue, either, so far as I know

<aleecia> cannot understand Dan

cannot hear

<justin_> npdoty, Well, that presumes market research as a permitted use! Otherwise, hard to imagine a scenario where the data wasn't required for a while, and then suddenly became required . . .

<Chris_IAB> inaudible

<aleecia> better!

<peterswire> +1 on justin

<Chris_IAB> justin_ that's funny :)

<aleecia> <grin>

dan, have yet to sync up with Ed, later this week

<npdoty> ACTION: auerbach to propose text on de-identification (with Ed) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action04]

peter, Ed interested in tech steps to de-identify

<trackbot> Created ACTION-371 - Propose text on de-identification (with Ed) [on Dan Auerbach - due 2013-02-27].

peter, rob v eijk and shane wiley had interesting conv.

<npdoty> rvaneijk, are you back?

<rvaneijk> yep, but not on the phone..

<rigo> rvaneijk: can you come to the phoneconf?

<npdoty> rvaneijk, we were just trying to get an update on your conversations with Shane

<rvaneijk> no, Peter has my notes.

peter, any other items?

<Zakim> dsinger, you wanted to ask about de-id: people or the data?

dsinger, de-id means can't identify person

<johnsimpson> Was there an action item on market research?

dsinger, how does shortening URLs de-identify people

<npdoty> johnsimpson, we have an action item on Richard Weaver on that topic

<aleecia> action-370?

<trackbot> ACTION-370 -- Richard Weaver to propose narrower "market research" use (with David Stark, Justin, Susan, Ronan, Rachel, Chris_M, EBerkower) -- due 2013-02-27 -- OPEN

<trackbot> http://www.w3.org/2011/tracking-protection/track/actions/370

<dsinger> ok, so you saying that even when de-identified, it's prudent to do data reduction as well?

peter, if you had a smallish bucket then urls may identify a smaller group and therefore identify a person

peter, url reduction may be enough

<aleecia> david++

dsinger, pattern of use if only hostnames

<fielding> sounds like a typical MIT student

<Chris_IAB> those aren't marketing collection categories

<tlr> "stem" = host name?

aleecia, you can fingerprint based on hostnames (url stems)

<vincent> it's not enough but it may help, also depends if you have the timestamp

aleecia, needs to be kept as issue

<npdoty> one concern has been that the URL data might *itself* be identifying (even if it's not attached to a real-world device or cookie id)

aleecia, when combined with activity over time, don't know how much but we need to keep it in mind

<Chris_IAB> aleecia, would that be akin to a "partial print"?

<vincent> interesting paper on that topic: "Why Johnny Can’t Browse in Peace: On the Uniqueness of Web Browsing History Patterns" (http://petsymposium.org/2012/papers/hotpets12-4-johnny.pdf)

dwainberg, primary concern is what people are reading online - this needs to be pursued

<dsinger> at the moment I am merely puzzled, neither opposing nor supporting, but wanting to understand what's being suggested

<aleecia> Roy - yes! at CMU we researched a hypothetical anthrax attack on the Super Bowl. The FBI must've learned that every two years, there was a week of activity with this homework assignment… it was one of those "why didn't I use Tor?" moments for me.

peter, thanks

<npdoty> thanks all

Summary of Action Items

[NEW] ACTION: aleecia to summarize texts, agreements, and uncertain bits to data around service providers (ideally with dsinger and perhaps npdoty, if willing) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action02]
[NEW] ACTION: auerbach to propose text on de-identification (with Ed) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action04]
[NEW] ACTION: pedigo to work on updated "service provider"/"processor" definition (with vinay) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action01]
[NEW] ACTION: weaver to propose narrower "market research" use (with David Stark, Justin, Susan, Ronan, Rachel, Chris_M, EBerkower) [recorded in http://www.w3.org/2013/02/20-dnt-minutes.html#action03]
 
[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.137 (CVS log)
$Date: 2013-02-20 18:22:53 $
