Tracking Protection Working Group Teleconference -- 12 Oct 2011

<aleecia> (I enjoyed my one year in SF)

<aleecia> Regrets from: Jonathan Mayer

<aleecia> (I have neglected to log those who sent regrets before, and will be better in the future)

<tlr> ScribeNick: enewland

Matthias: Item 1. Minutes from last call are approved

announcements

Matthias: for today the focus will be on the tracking preference expression
... next week will go back to compliance spec and aleecia will chair

<clp> I could not open the minutes link from last time

Matthias: Open action items are: aleecia to look at summary of DNT definition and compliance proposals

Aleecia: still working on that.

<npdoty> clp, there was a corrected link for the minutes sent as an email reply

Aleecia: action for Shane Wiley, to draft text for issues 23 and 24

<tlr> I believe he sent regrets

<npdoty> ISSUE-23?

<trackbot> ISSUE-23 -- Possible exemption for analytics -- raised

<trackbot> http://www.w3.org/2011/tracking-protection/track/issues/23

Aleecia: Shane does not appear to be on the call. we are closing this action.

ISSUE-24?

<trackbot> ISSUE-24 -- Possible exemption for fraud detection and defense -- raised

<trackbot> http://www.w3.org/2011/tracking-protection/track/issues/24

<npdoty> ACTION-5?

<trackbot> ACTION-5 -- Shane Wiley to draft proposed text to resolve ISSUE-23 and ISSUE-34 (organizations should commit to blah, must do the following things) -- due 2011-10-03 -- OPEN

<trackbot> http://www.w3.org/2011/tracking-protection/track/actions/5

<npdoty> ACTION-12?

<trackbot> ACTION-12 -- Nick Doty to write a proposal on what DNT means to 3rd parties (for davidwainberg) -- due 2011-10-04 -- OPEN

<trackbot> http://www.w3.org/2011/tracking-protection/track/actions/12

Aleecia: it is premature to close it, with involved people not on the call

<npdoty> dwainberg, any updates?

Matthias: we will leave the Action 5 open
... any update from dwainberg on Action 12?

dwainberg: Thought that was dispensed with on the other call

Matthias: Aleecia and I will follow up over email

<aleecia> JC also not able to join today, I believe

<tlr> action-13?

<trackbot> ACTION-13 -- Thomas Lowenthal to propose a strawman proposal spec for a mandatory DNT server response -- due 2011-10-10 -- PENDINGREVIEW

<trackbot> http://www.w3.org/2011/tracking-protection/track/actions/13

<tlr> http://www.w3.org/2011/tracking-protection/track/actions/pendingreview

<tlr> http://www.w3.org/2011/tracking-protection/track/actions/13

tl: I also have Action 13.

<aleecia> Regrets list is now: JC Cannon, Jonathan Mayer, Shane Wiley

tlr: That is going to be part of the discussion later on in the call

Matthias: the next item is for the editor of the compliance spec, Sean, to summarize the discussion that happened in the past week

Sean: We are summarizing all of the issues and putting them into outline form. Based on that we will try to create an outline for the strawman doc.
... we hope to have an outline of a strawman doc on Friday and then to proceed from there.

<tlr> JC, where are you on action-14?

<tlr> action-14?

<trackbot> ACTION-14 -- JC Cannon to write straw man proposal on response from server being optional (related to Issue-81) -- due 2011-10-10 -- OPEN

<trackbot> http://www.w3.org/2011/tracking-protection/track/actions/14

Justin: do you want me to summarize the comments on the list serve on the first party/third party issue?

<aleecia> I thought the point of this call was a different spec...

Justin: there were competing specs on the first party and third party issues from jmayer and tl
... Jmayer basically said there should be no must obligations on first parties
... tl had a number of should recommendations. That first parties should take certain actions
... it is probably fair to say that for the most part the comments on the list were more along the lines of jmayer's recommendation
... the one caveat was that as Lee and maybe Brett pointed out, first parties should not be able to collect data and then use it to work around the third party prohibition
... perhaps it would make sense to have must not language saying that they must not evade the intention of dnt by using data in unexpected ways

<tlr> issue-89?

<aleecia> Issue: might want prohibitions on first parties re-selling data to get around the intent of DNT

Justin: there were also two specs on third parties. not really competing but differing. Jmayer's was about what happens when third party is using data for analytics purposes. TL had more detailed proposal on third parties in general. There was not much discussion on this, but more discussion on Aleecia's question, possibly ISSUE - 89
... Discussion as to whether this should be about behavioral advertising or collection

<dsinger_> I do feel that if ANY party is continuing to track in the presence of DNT, they MUST respond saying so and why - in this case, I am first party and exempt.

Justin: work on the document will consider those issues and we can have more substantive discussion on the call re: those issues next week

<tlr> trackbot, ping?

TL: one component of first/third party issue. In addition to first party collecting and then providing PII to third parties despite DNT, there is also a Q of whether third parties should only be able to outsource info to other third parties that respect DNT. But this is already captured in discussion on the mailing list

Sean: we are moving forward as quickly as we can on the spec

<trackbot> ISSUE-89 -- Does DNT mean at a high level: (a) no customization, users are seen for the first time, every time. (b) DNT is about data moving between sites. -- raised

<trackbot> http://www.w3.org/2011/tracking-protection/track/issues/89

<trackbot> Created ISSUE-91 - Might want prohibitions on first parties re-selling data to get around the intent of DNT ; please complete additional details at http://www.w3.org/2011/tracking-protection/track/issues/91/edit .

<trackbot> Sorry, tlr, I don't understand 'trackbot, ping?'. Please refer to http://www.w3.org/2005/06/tracker/irc for help

Justin: it was really useful to see Roy's draft spec

<aleecia> that's some slow (re: trackbot)

Matthias: we will continue this discussion next week with Aleecia chairing
... moving on to new business

Tracking Preference Expression, Response Headers

<fielding> related sections in draft: http://www.w3.org/2011/tracking-protection/drafts/tracking-dnt.html#responding

Matthias: any comments on the editor's draft that Roy sent around yesterday?

Roy: if you see something missing or out of place, let us know

<schunter> DISCUSSION STRUCTURE for RESPONSES:

<schunter> 1. GOALS: What does the response header try to achieve?

<schunter> 2. CRITERIA: What criteria do we use to assess quality?

<schunter> 3. OPTIONS: What alternative implementations exist?

Matthias: moving on to the discussion of the response headers. there has already been lively discussion on the mailing list. People asked why response headers are useful and asked what options exist for response headers. I suggested a structure for this discussion.
... question of whether you need to reflect the input value or not. And question of how to keep this compatible with existing structure of the web, eg caching
... I propose having three categories of input -- goals, criteria, and options.

<tl> +?

Matthias: For example, a goal would be telling the browser what the server will do. Efficiency of implementation would be an example criterion.
... Are there questions on the proposed structure?

tl: one commentary, you are referring to very concrete goals. as if we want to know all the possible values of a particular implementation in advance. We need to recognize that innovation we may not know about is an important goal to be considered. Even if we do not know what someone will do with a particular feature, if that feature is amenable to future implementations then we should err on the side of that feature.

Matthias: So you are proposing a criterion that is future-proof and extensible?

<schunter> CRITERIA: extensible / future proof

tl: It should be extensible and future proof

adrian: there may be future uses we don't know about.
... this is something we should consider in the discussion but it shouldn't necessarily trump other considerations

Matthias: need to consider cost of including it

<schunter> Aleecia-GOAL: Transparency reasons: What happened

Aleecia: going back to our Boston discussions. People were interested in having a response header for transparency reasons. Might also help users opt back in for specific companies.
... this way a company can remind users they have opted out.
... there were comments that a response header could be useful for auditing.

<schunter> GOAL: Auditing: What is going on?

<rvaneijk> it is useful for compliance as well

<npdoty> I think Aleecia's last "have opted out" refers to "have opted back in"

Aleecia: also there was a sense that we only hold to a DNT standard those companies that have opted in to being held to that standard
... and a header would help make that happen. so that the assumption isn't that companies are complying with DNT

<jules> jules polonetsky is here,

Ksmith: making sure this is extensible is very important. but it doesn't make a ton of sense to include things that we don't see the value of now. By definition, extensible just means it is easy to add things on in the future. Best to keep things clear and simple now.

<tlr> Note that these were goals from the Boston discussion.

matthias: what is potential use of the header? Aleecia said auditing, transparency, and ensuring that respecting DNT is an opt in for companies

dsinger_: in terms of goals, if a site is continuing to track you, if it thinks it has a valid reason - it is a first party, you have opted to be tracked - the user should know that.
... if the site claims that it is not tracking you then the user has a right to remember that. if it ends up that that isn't true then the user may have some recourse.

jkaran: how many users actually know what a response header is?

<aleecia> (presumably they would not see it, but browsers could build in a UI.)

jkaran: what is the point of talking about a response header when users don't know what that is or will never see it

Matthias: you are saying that the information that is communicated should be useful to the end users or shouldn't be there?

<dsinger> ...my browser might log it; might warn me that site XXXX is still tracking me, and so on.

jkaran: I am just not sure what users would do with this information. If the goal is for auditing or security reasons, that makes sense, but stating as a goal that this information would be in the header for user usage maybe doesn't align with what users can understand.

tl: The response header will be mediated by the browser, etc.

<dsinger> ...agrees with tl; users do not see TLS transaction details, but the browser can warn/advise of its use nonetheless

tl: we are providing the tools at the technical level for the user's agent to interpret it in a way that might be useful to the user. Like an icon

dwainberg: we have talked about use of the response header for auditing in general. I am curious how it would be used for auditing.
... And I would propose that if we are going to have a response header and auditing is a rationale for that, then we need to craft that response just for auditing needs.
... perhaps only need a static instead of a logical response
... I am interested to hear what the auditing that people have in mind would look like
... For me, an audit is some process by which you confirm that something is or is not happening

<clp> auditing -- 3rd party research of log data across sites finds ways to see if when claimed behavior is true

<justin> I think jonathan would argue that "auditing" = "enforceability" in this context (Section 5 liability)

Aleecia: to summarize other peoples' points in Boston. As a couple of examples, this would be a way to confirm that those who claim to be honoring DNT are in fact doing so.

<rvaneijk> auditing -- from a user's perspective a mechanism to check whether dnt is being honored or not

<npdoty> justin, I think those are actually two separate goals

Aleecia: we also heard about third party verification - which TRUSTe was interested in pursuing
... another possibility is it would be useful to know how many sites have DNT enabled

<clp> auditing for adoption rates

<npdoty> +1 on crawling, quantification, 3rd-party documentation of sites that respect DNT

Aleecia: would help with potential FTC evaluations

<dsinger> ...my UA might keep a log of (a) sites that claim to respect DNT (b) sites that do not respond (c) sites that claim to be exempt. I could also build a proxy that does that for e.g. my company
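
Illustration (not from the call): a minimal Python sketch of the kind of log dsinger describes, sorting sites into the three buckets he lists. The header name `x-dnt-response` and its values `honoring` and `exempt` are hypothetical; the group had not agreed on any response syntax at this point.

```python
from collections import defaultdict

# Hypothetical header name and values; no syntax had been agreed on the call.
RESPONSE_HEADER = "x-dnt-response"


def classify(headers):
    """Sort one response into the buckets (a), (b), (c) from dsinger's comment."""
    value = headers.get(RESPONSE_HEADER)
    if value is None:
        return "no-response"          # (b) site did not respond
    if value == "honoring":
        return "claims-to-respect"    # (a) site claims to respect DNT
    if value.startswith("exempt"):
        return "claims-exempt"        # (c) site claims to be exempt
    return "unrecognized"


def build_log(responses):
    """responses: iterable of (site, headers) pairs with lower-cased header names."""
    log = defaultdict(set)
    for site, headers in responses:
        log[classify(headers)].add(site)
    return log


observed = [
    ("news.example", {"x-dnt-response": "honoring"}),
    ("ads.example", {}),
    ("shop.example", {"x-dnt-response": "exempt; reason=first-party"}),
]
log = build_log(observed)
for bucket in sorted(log):
    print(bucket, sorted(log[bucket]))
```

The same function could sit in a browser extension or in a company proxy, since it only needs the response headers.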

<justin> npdoty, fair enough, but that's the argument I've heard most often from those that want it (I'm not one of them)

dwainberg: if auditing/enforceability is an important rationale then it's important to drill down a bit more

<schunter> Roy: we need to create distinct goals by clarifying terms. Each type of audit should be a different goal of the protocol

ksmith: We have already stated that the header is not for the user anyway. The header doesn't do anything that the privacy policy doesn't do

<Brett> Maybe not in a privacy policy, but in a machine readable, standard location, not necessarily in the header.

<npdoty> who said that the header wasn't for the user anyway?

<ninjamarnau> i think a response header offers the possibility of an automated notification of the user, in which way soever

<schunter> OPTION1: Response header

ksmith: we only need it in the response header if we want the browser to act on it

<schunter> OPTION2: well-known location

ksmith: otherwise it could all be handled in a privacy policy

<fielding> Proposal: define a well-known location (URI) on site for machine-readable indication of compliance

<schunter> OPTION3: Human-readable privacy policy

<aleecia> It seems entirely reasonable that a browser would want to react to it, actually

fielding: I wrote down four goals so far. Auditing, transparency, opt in, and measurement.
... we should define a well-known URI where we would have a machine readable doc, such as a json response, that would be a short set of attributes that would be fully extensible - is tracking being used for this site? is tracking required for this site? That would be an easy way to determine compliance, deployment, etc.
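
Illustration (not from the call): a minimal Python sketch of the machine-readable document fielding describes, assuming a JSON object served from a well-known path. The path `/.well-known/tracking-status` and the attribute names `tracking_used` and `tracking_required` are hypothetical; the call named neither.

```python
import json

# Hypothetical location; the call only said "a well-known URI".
WELL_KNOWN_PATH = "/.well-known/tracking-status"


def make_status_document(tracking_used, tracking_required, **extensions):
    """Serialize a short set of attributes; extra keyword arguments are extensions."""
    doc = {"tracking_used": tracking_used, "tracking_required": tracking_required}
    doc.update(extensions)
    return json.dumps(doc, sort_keys=True)


def read_status_document(text):
    """Read the two core attributes and ignore any attribute the reader does not know.

    A missing attribute is reported as None rather than guessed.
    """
    doc = json.loads(text)
    return {
        "tracking_used": doc.get("tracking_used"),
        "tracking_required": doc.get("tracking_required"),
    }


text = make_status_document(True, False, audited_by="example-auditor")
print(WELL_KNOWN_PATH, text)
print(read_status_document(text))
```

Ignoring unknown attributes on the reading side is one way to get the extensibility asked for earlier in the discussion: a site can add fields without breaking existing clients.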

<fielding> oops

fielding: the second proposal would be, instead of using a header field, to add a new status code in HTTP.

Matthias: this discussion should come later.

<fielding> go ahead, I am done talking ... will call back in

Matthias: so why do we want the protocol

<tlr> whooops

Matthias: four reasons, compliance, auditing, measurement, and opt out

clay_opa: Roy's comments mirrored what i was thinking. A machine-readable static file would be best.
... maybe different 200 codes
... the main reason i wouldn't want to completely add to the response header, is that for each and every response sent back, you are adding extra bytes

tl: I think that having an individual response for each user, saying what the site is doing for that user, that can then be interpreted by that user's browser is important
... that is a goal we should be working toward here. If sites are doing any sort of opting back in behavior, they should be giving that user information about how they are treating that user. So that the user can react appropriately. it's a use case we shouldn't be dismissing.

clp: When I talked to the Chrome guys in Boston, they asked what the goal was. I would state it this way: during the transition we also have an educational role.
... imagine a browser that had a paranoid mode or learning mode, where the background of each page could be colored based on how the server was responding. then the user can try to avoid certain pages
... the goal would be to educate the user

<schunter> GOAL is transparency, isn't it?

rvaneijk: I would phrase this as a consent feedback mechanism
... especially in the EU, it is useful for the user to know whether the opt in has been acknowledged or not

<clp> traffic light - red for not respected DNT, green for respecting it

rvaneijk: I work for the Dutch data protection authority, but at the moment I speak for myself.

dsinger: the reason to put it in the response is that it is sometimes contextual whether you are tracking or not.

<schunter> CRITERIA: Express fine-grained track/no-track for pieces of a site

dsinger: on the argument about adding bytes - doesn't seem to be a warning. The error codes are orthogonal. Don't want to change 200 or 404 to pack two answers into the same value.

<ksmith> Is it safe to summarize the last several comments as "A header needs to be sent so that the browser has the opportunity to change the user's experience accordingly?"

dsinger: whether you are being tracked and whether the base request is being satisfied are two different issues
... to echo points said before. We have two dynamic issues. First we are dealing with a specific user who has opted back into certain sites. In other cases, we have a company that is first party in one context and third party in another.
... however, we can have a file that is dynamically generated, that can be a per user, per use contextual piece of information just as a header would be.
... there are things that a browser can do getting a header response that a browser can't do getting info from a text file.

<aleecia> will summarize

Brett: Something interesting surfaced in this discussion. It feels like we are asking the user to check a do-not-track me box and then telling them they will still be tracked in lots of ways.
... if the response is truly contextual or if we believe the tracking site is indeed allowed an exception, then that could be reflected in the header.
... the header could say, i see that you say DNT, but i am tracking you for the following reasons. And to me this is a reason why we want a header. So that users understand what is going on. Transparency.

<aleecia> basically: Tom's point is that the response can vary by user (a given user opts back in, another doesn't) and David's point was that sites can vary (1st party in one case, 3rd party in another) - so a response is contextual. My point was that we can have dynamically generated text files, not just dynamically generated headers. No need for it to be a static file. However, there may be reasons why a browser can do things differently based on header v. known location file. That seems the important question.

<aleecia> Tom: you're dropping out

<clp> your audio breaking up

tl: To respond to Aleecia's comment about the different functionality of the generated page and the header response. A dynamically generated page may not have the level of per request granularity that the header response does.

<dsinger> the snag with a separate transaction is that the server now has to try to work out what the other transaction(s) you are asking about were about

<aleecia> I'm not understanding this in part from drop out

<aleecia> So I'm getting "yes, it make a difference" but I didn't understand why

npdoty: If i had to load a file every time i made a request of a third party, this would hurt our efficiency requirements.

<tlr> tl's argument was that, if things are dynamic, then a well-known location is associated with some sort of possibly tenuous state.

matthias: moving on, the question is once we generate options, what are the criteria?

<clp> some of it was e.g. servers do time-outs and special cases only they understand, dynamically, can say now I am DNT off, not a document choice but contextual to the response.

matthias: one requirement is the caching requirement.
... the protocol is good if it doesn't destroy the caching mechanisms of the web.

fielding: You can't have large dynamic sites shut down all cacheability for their sites.
... you need to find ways to mitigate that if you have the response. One way is to have a static machine readable file. The other is to have a separate resource so that if the client wants to know a domain's policy on tracking before it accesses other resources on the site, it could retrieve that. This is not an expensive operation

<schunter> CRITERIA: caching compatibility

fielding: this has the added benefit of being applicable to third parties. Third parties can say themselves what they are willing to honor re: DNT

<schunter> OPTION: data at well-known URI

<aleecia> (it might make sense for someone to do some quick math as to if there's an efficiency gain from headers v. files, and if so, is it by enough to matter. This seems like the sort of thing we can answer.)

fielding: even moderately busy websites have to process things on the order of 10k requests per second.
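
Illustration (not from the call): a back-of-envelope Python sketch of the "quick math" Aleecia suggests, using the 10k requests per second figure fielding cites. Everything else is an assumption made for the example: the header line `Tracking-Status: 1`, the 300-byte size of a well-known file, and the guess that a client fetches that file once per 100 requests. The overhead of the extra HTTP request itself is ignored.

```python
REQUESTS_PER_SECOND = 10_000              # figure cited on the call

# Hypothetical header line, including the CRLF that ends it on the wire.
HEADER_LINE = b"Tracking-Status: 1\r\n"

# Assumptions for the well-known-file alternative (not from the call).
FILE_BYTES = 300                          # size of the machine-readable document
REQUESTS_PER_FILE_FETCH = 100             # one fetch of the file per 100 requests


def header_cost(requests, header_bytes):
    """Extra bytes sent when every response carries the header."""
    return requests * header_bytes


def file_cost(requests, requests_per_fetch, file_bytes):
    """Extra bytes sent when clients fetch a separate file now and then."""
    fetches = requests // requests_per_fetch
    return fetches * file_bytes


per_second_header = header_cost(REQUESTS_PER_SECOND, len(HEADER_LINE))
per_second_file = file_cost(REQUESTS_PER_SECOND, REQUESTS_PER_FILE_FETCH, FILE_BYTES)
print("header bytes per second:", per_second_header)
print("file bytes per second:", per_second_file)
```

The result depends heavily on how often clients would re-fetch the file, which is exactly the number the group would need to estimate.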

Brett: If every response is unique, that could break caching. The scope of this is pretty large. b/c every image in every page could be doing caching.
... but this is largely true because any asset I request on the web, just about, includes a cookie that is unique to me. How do we set unique cookies on a per user basis and not break caching?

fielding: Usually there is one resource that sets the cookie, so that all of your JavaScript files, images, etc. do not set cookies

<clp> So why can't that work the same way, most of a page will be from known sources, share DNT?

<aleecia> But cookies are sent automagically?

fielding: the sites are designed to handle this.

Brett: so it is the same issue?

fielding: sites are designed to handle scalability/caching for those particular resources

<Brett> If you delegate to a particular resource, isn't that basically the same as standard location?

Thomas: what about delegating active response to a particular set of resources

<aleecia> +1

Thomas: could we design an approach that is static for static resources and that indicates that there is something dynamic to be found on the tracking resources on the side.

<schunter> OPTION: static headers for elements that never track (like "i am neutral") and then having headers for "I am a tracking element and I accept your choice to not be tracked"

Thomas: it sounds to me as though there ought to be a way to replicate the dynamic cookie pattern in this case.
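
Illustration (not from the call): a minimal Python sketch of the split Thomas and schunter outline, with a fixed header on cacheable elements and a per-user header only on elements that are already non-cacheable. The header name `Tracking-Status` and its values are hypothetical; the `Cache-Control` directives are standard HTTP.

```python
# Hypothetical header name and values; only Cache-Control is standard HTTP.
HEADER = "Tracking-Status"


def response_headers(cacheable, user_opted_back_in=False):
    """Pick headers for one response.

    Cacheable elements get the same static answer for every user, so shared
    caches can keep serving them; the opt-in flag is ignored for them.
    Non-cacheable elements get an answer that can differ per user.
    """
    if cacheable:
        return {
            "Cache-Control": "public, max-age=86400",
            HEADER: "never-tracks",
        }
    status = "tracking-with-consent" if user_opted_back_in else "not-tracking"
    return {
        "Cache-Control": "private, no-store",
        HEADER: status,
    }


print(response_headers(cacheable=True))
print(response_headers(cacheable=False))
print(response_headers(cacheable=False, user_opted_back_in=True))
```

This mirrors the cookie pattern fielding describes: the dynamic answer is confined to the few resources that were never cacheable in the first place.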

<clay_opa_cbs> +1 Thomas

<ninjamarnau> good point

fielding: any site used for the purpose of tracking is going to be marked as non-cacheable

<aleecia> I didn't follow Roy's last point

<fielding> I mentioned on the mailing list that caching is not a problem if the response is only made on resources that are already non-cacheable -- like the resources that do tracking.

<aleecia> I would think you could have static content flag itself as static, and still requires a response header.

matthias: the main point is it's not a good idea to require a response header all over the site as this would limit cacheability.

<tl> unmuting

<tlr> fielding, what about redirecting the user to a unique, one-time, cacheable URI each time?

<tlr> there are wrinkles here that I think we need to think about.

tl: if we think we only need a dynamic response for content flagged as non-cacheable then we also need specific policies for content that is cacheable

<fielding> tlr, the redirect would then contain the response

tl: we should be able to prohibit tracking for cacheable content

<clp> Simplicity

Matthias: besides cacheability, are there other criteria for determining whether a protocol is good or bad?

<tl> ...and we should require a static "will not track" header on cacheable content

rvaneijk: I am still puzzled. As far as I understand, when a client requests a page that is cached, the server will extend an e-tag to determine if the page has changed and this will still trigger a clickstream
... and if you are serving ads from different websites, it is still possible for user to be tracked across several websites

fielding: I wasn't suggesting that tracking not apply to cacheable responses. I was saying we don't need to respond uniquely to the client on every response, including those for cacheable content. We are talking about shared caching.

<aleecia> I'm not sure we can know a priori when something is or is not tracking, for all time in the future (or even for today)

<rvaneijk> ok tnx

<aleecia> That was the part of Roy's comment I hadn't understood

<hefferjr> "Shared caching" also describes "Content Delivery Networks" (such as Akamai)

tl: if we have decided that cached content doesn't require a response header then we have failed.
... this means there is a piece of content for which a user can't tell whether or not their request is being honored
... if we go in the direction of saying that cached content does not have an individualized response header, then we should say that cached content MUST NOT be used for tracking

matthias: if we go back to the criteria, then for tl, an important criterion is that users are able to tell if they are being tracked or not, and to be able to tell this from all elements

<justin> I don't think anyone is arguing that Tracking from cached content is allowable in response to a DNT header.

tl: I want to distinguish between elements that have individualized headers and those that say 'this element is never used for tracking'

<tlr> "I'm the home page. I'm not tracking you, but the image over there might."

<tlr> tl, is that roughly the meaning you're after?

matthias: to summarize the options:

<schunter> OPTION: well known URI

<schunter> OPTION: Different types of headers

<tl> tlr, that's more detailed. more like "i'm the homepage|background image|favicon. i'm cached and never used for tracking"

clp: another is fine grained responses

<aleecia> Was Charles

<aleecia> To inject one more point: tracking can be browser fingerprinting. Works fine with static content.

<aleecia> So I don't get why static content *cannot* be tracking, as per Roy

X: another option might be no header response?

<schunter> OPTION: No header/response

<fielding> OPTION: error response status code for indicating "must opt in"

<tl> ACTION: tl to update mandatory response header proposal to acknowledge caching concerns [recorded in http://www.w3.org/2011/10/12-dnt-minutes.html#action01]

<trackbot> Created ACTION-16 - Update mandatory response header proposal to acknowledge caching concerns [on Thomas Lowenthal - due 2011-10-19].

<tlr> OPTION: specific HTTP error response

<aleecia> Isn't this a different issue?

Roy: I want to add an option for a specific error response in HTTP, to indicate that a client has to opt in: you can't use my service unless you opt in.

Matthias: moving on. Technical discussion of how a site knows if it is first or third party. Second, how do you tell people they should opt back in.
... we can discuss either of them

<aleecia> opt back in would be easier

<clp> agree

<jkaran> agree

tlr: let's discuss opting back in. But first, one other quick thing. Tom and Roy should feel free to do a lot of cross review with Action item 16 so we make sure we have something they agree is technically practical

Matthias: I will send an email out kicking this off.
... next meeting is next week. same time, same place.

<clp> laughs

<aleecia> are we adjourned?

<clp> bye

tlr: Register for face to face. Registration is extended until 21st of October.

<ninjamarnau> thanks for extending the deadline
