18:17:09 RRSAgent has joined #dntb
18:17:09 logging to http://www.w3.org/2013/02/11-dntb-irc
18:17:24 fielding has joined #dntb
18:20:17 vinay has joined #dntb
18:20:28 hefferjr_ has joined #dntb
18:20:42 Joanne_ has joined #dntb
18:21:30 jmayer has joined #dntb
18:21:46 wseltzer has joined #dntb
18:29:48 kulick has joined #dntb
18:31:09 robsherman_ has joined #dntb
18:34:26 BRB
18:35:09 npdoty has joined #dntb
18:36:06 Zakim, code?
18:36:06 sorry, npdoty, I don't know what conference this is
18:36:11 Zakim, this will be 26632
18:36:11 ok, npdoty; I see Team_(dntb)18:30Z scheduled to start 6 minutes ago
18:36:14 Zakim, code?
18:36:14 the conference code is 26632 (tel:+1.617.761.6200 sip:zakim@voip.w3.org), npdoty
18:36:54 i am hearing nothing on the phone
18:37:46 okay... just making sure... thx
18:38:08 thanks Nick
18:39:34 dwainberg has joined #dntb
18:39:54 now being heard
18:40:03 yep
18:40:20 hello
18:40:30 hello
18:40:58 Paul_G has joined #dntb
18:41:14 marc has joined #dntb
18:42:35 can someone scribe this?
18:42:37 npdoty, can you type the questions in irc when you get a chance?
18:42:46 “Lifetime browsing history” is a phrase that is often used, but never defined clearly. What would LBH mean as a technical matter?
18:42:52 In light of this definition, what technical measures would suppress or delete LBH?
18:45:44 tlr has joined #dntb
18:45:56 Is someone breathing heavily into their microphone? Could everyone please check if they are? It is making it difficult to hear on the phone.
18:46:08 Zakim, who is making noise?
18:46:08 sorry, npdoty, I don't know what conference this is
18:46:13 Zakim, this is 26632
18:46:13 ok, npdoty; that matches Team_(dntb)18:30Z
18:47:08 back
18:47:18 scribenick: npdoty
18:47:34 endpoints vs. isp's or browsers or browser plugins
18:47:42 dwainberg: what the concern is?
18:48:39 jmayer: do we have ambiguity on that point?
18:49:09 Are we being tasked to define a hypothetical scenario?
18:49:45 jmayer: the URLs that a user has visited
18:49:46 chadhage_nielsen has joined #dntb
18:50:23 "URLs the user has visited"? Does that include third-party URLs? Does it include single-site knowledge vs cross-site knowledge?
18:51:00 jmayer: difference between URLs on a single site, and URLs across multiple sites
18:51:16 dwainberg: kind of new to focus on browsing history
18:51:34 I don't think there's anything new here. The EFF/Mozilla/Stanford proposal focuses extensively on linkability of user activity.
18:52:27 not relevant to this discussion
18:53:35 s/not relevant/current discussion on phone is not relevant/
18:55:01 npdoty: a definition could be: "list of URLs a user visited on multiple sites"
18:55:20 so, LBH is collection of URIs visited over time beyond the scope of a single first party?
18:55:59 peterswire has joined #dntb
18:56:02 #dntc
18:56:16 marc: but for a large first party (AOL owns many different publications), that party might know URLs I've visited on Huffington Post and other publications
18:56:51 rrsagent, make record world
18:56:59 jmayer: we've made progress on first vs. third party, can we agree with that?
18:57:04 room: yeah
18:57:15 paul: is the harm the transfer to a third party?
18:58:14 assume there is no harm and solve the technical issue for the sake of not having to meet forever
18:58:39 jmayer: had the questions on harm already
18:58:49 npd: repeat of our high-level questions
18:58:49 wouldn't LBH mean = the collection of all URLs the user visits (spanning all sites).
19:00:17 let's assume lifetime == more than the current browsing session and less than browser product lifetime
19:00:18 marc: retention policies, and minimization policies
19:01:15 agree with vinay
19:01:17 dwainberg: different amounts of time vs different breadths of sites -- not sure there's a quantitative limit
19:01:31 -Jonathan_Mayer
19:02:00 dwainberg: domain vs. full URI; retained in a linkable or unlinked form
19:02:51 npd: could we accomplish business cases with just the domain?
19:03:29 The parts of the URI that are needed to retain depends on who is doing the collecting.
19:03:56 BillScannell_ has joined #dntb
19:04:30 npd: minimization of reducing just to domain (rather than path or parameters) could help with privacy concerns
19:05:05 dwainberg: limit to a legitimate business purpose (not disclosed publicly but to an auditor)
19:06:08 npdoty, I'd be shocked if folks who think "what you read" is private would be willing to accept domains as "private enough".
19:08:13 ronan: want time limits on retention in addition to amount of data collected
19:08:54 dwainberg: but that could fix the maximum too high
19:09:37 fielding: most concerns are about domain viewing, rather than page viewing
19:10:01 ... technical: not save the data
19:10:11 ... 2) cryptographically hash the data
19:10:21 ... ... strong enough to not be easily broken
19:10:39 ... ... save categories/buckets associated with a URL, rather than the URL itself
19:12:15 peterswire has joined #dntb
19:12:46 dwainberg: for many businesses, it's true, but ability to target depends on the time collected
19:14:30 ... converting URLs into interest categories can be done in a very short time
19:14:46 ... reporting might need the domain or path for a longer period of time
19:15:34 vinay: some ad reporting requires proof of the negative -- that a given ad did not appear next to a competitor or on a "bad" site
19:15:48 (adding some notes, to be sure it's clear)
19:16:42 For targeting purposes, most 3rd party biz models, have limited need for full URI to be retained -- some 2 secs, some 2 days, some not much longer. For targeting only, there is not a long term need for the URIs.
19:17:15 However, for measurement, billing, etc, there is a longer term need to retain URI, or at least domain information.
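The data-reduction options raised above (keep only the domain, store a cryptographic hash instead of the URL, or store only a category bucket) can be illustrated with a minimal Python sketch. Nothing here comes from the call itself: the function names, the `CATEGORY_BY_DOMAIN` table, and the choice of HMAC-SHA256 as the "strong enough" hash are all assumptions made for illustration.

```python
import hashlib
import hmac
from urllib.parse import urlsplit

# Hypothetical category table; a real collector would use its own taxonomy.
CATEGORY_BY_DOMAIN = {
    "news.example": "news",
    "cars.example": "autos",
}


def to_domain(url: str) -> str:
    """Keep only the host, dropping port, path, parameters and fragment."""
    return (urlsplit(url).hostname or "").lower()


def keyed_hash(url: str, key: bytes) -> str:
    """Replace the URL with an HMAC-SHA256 digest under a secret key.

    The key is an assumption: an unkeyed hash of a URL can be reversed
    by hashing a list of known URLs and comparing digests.
    """
    return hmac.new(key, url.encode("utf-8"), hashlib.sha256).hexdigest()


def to_category(url: str) -> str:
    """Store only an interest bucket for the URL, not the URL itself."""
    return CATEGORY_BY_DOMAIN.get(to_domain(url), "other")
```

Each function is lossier than the one before it: `to_domain` still reveals which site was visited, `keyed_hash` only supports equality comparison for holders of the key, and `to_category` keeps no site-level detail at all.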
19:18:57 npdoty: could retain full URI but not retain a user identifier
19:20:26 ronan: frequency capping might be a case that requires user identifier
19:20:43 dwainberg: attribution or conversion tracking
19:23:02 peterswire has joined #dntb
19:26:05 dwainberg: what is the harm? if it's data breach, then that requires a different set of solutions
19:26:31 it sounds like what we are saying is that the mechanical means to suppress LBH will have to differ based on the purpose and timeframe of permitted use
19:27:02 npdoty: concerns identified have been multiple: data breach, government access, malicious use, or just the presence/retention: why does this site know that about me? (trying to give a very brief summary)
19:31:04 Another way to look at it … one can disassociate LBH by either 1) reduce data collected about BH; or, 2) remove association of BH with the user/agent/device
19:31:25 +1 to fielding
19:32:53 Are there any compelling use cases for retaining detailed browsing history beyond a general time limit on retention?
19:32:59 If so, how would you limit those use cases consistent with the goals of: (1) limiting LBH; while (2) enabling “buckets” or “low-entropy cookies”?
19:35:55 defining browsing history: URLs (including domain, path, parameters) across multiple sites beyond a session (or request?)
19:36:46 Leave it as a question: what would the user find as an acceptable lifetime for their BH? Browsers keep 14 days.
19:38:55 fielding: common default configuration of a browser is keeping history data for 14 days
19:39:25 ronan: but cache could potentially be a lot longer
19:39:40 dwainberg: we should be vague about the length of time
19:40:40 ronan: history might also refer to the content
19:41:02 schunter has joined #dntb
19:42:03 npdoty: similarly sensitive profile in the ads that I've seen, not just the articles I've read online
19:43:38 I am not following the freq capping use case -- it does not mean that you keep a list of every ad seen
19:43:43 ronan: need to keep a list / history of all the ads he had seen
19:47:44 dwainberg: data isn't kept in a single list, would have to join multiple tables
19:47:57 npdoty: does that make a distinction for a user concern?
19:48:38 dwainberg: less likely for an attacker to breach multiple tables/databases at the same time
19:49:02 ... reduce the concern if you have good internal operational controls
19:50:17 it's 3:15
19:53:26 dsinger has joined #dntb
19:54:08 paul: "lifetime browsing history" a scarier term than our scoped definition
19:54:43 ... correctly captured the two general techniques (reducing data, or de-identifying)
19:55:53 ... users using multiple devices
19:56:06 ... most third parties don't have a Web-wide breadth
19:58:09 paul: can draw a bright line, but depends on purposes
20:00:06 peterswire has joined #dntb
20:03:13 zakim, who is making noise?
20:03:24 fielding, listening for 10 seconds I heard sound from the following: hefferjr (53%), +1.617.253.aaaa (35%)
20:03:39 zakim, mute hefferjr
20:03:39 hefferjr should now be muted
20:04:32 suppressed LBH
20:05:31 it isn't quite the same as de-identification since there is still some potential of identifying within the context (e.g., user sends their own name in submit)
20:08:12 zakim, unmute hefferjr
20:08:12 hefferjr should no longer be muted
20:10:03 third dimension of time, keeping a history for only a minute or only a hour might satisfy suppressing lifetime browsing history
20:11:20 regarding technical measures, can suppress on any of those three
20:11:59 hash by site, hash by campaign, hash with limited lifetime salt
20:12:11 deleting data (addressing time); reducing specificity of data (so that you have less than "domain"); removing association to a user (either "de-identified" or aggregation)
20:15:35 dwainberg: use cases -- targeting is easier under suppressing history than financial reporting
20:17:37 fielding: can hash a user to a campaign rather than having a full list of ads seen by a user?
20:17:40 more expensive to process identifier when hashed by campaign, but the process is trivially parallel (meaning it can be done at scale)
20:17:47 ronan: more computationally expensive to do so, though
20:18:07 dimensions: data -> full URI; domain & path; domain; extracted category data
20:18:25 association -> uid; de-id*; aggregate
20:18:32 time -> [continuous]
20:19:02 okay
20:19:05 -kulick
20:19:06 -vinay
20:19:06 -Joanne
20:19:08 -hefferjr
20:19:08 -Fielding
20:19:09 thanks
20:19:09 rrsagent, draft minutes
20:19:09 I have made the request to generate http://www.w3.org/2013/02/11-dntb-minutes.html npdoty
20:19:54 - +1.617.253.aaaa
20:19:55 Team_(dntb)18:30Z has ended
20:19:55 Attendees were vinay, Jonathan_Mayer, hefferjr, kulick, Joanne, Fielding, +1.617.253.aaaa
20:24:36 dsinger has joined #dntb
20:37:40 rrsagent, draft minutes
20:37:40 I have made the request to generate http://www.w3.org/2013/02/11-dntb-minutes.html npdoty
20:37:44 rrsagent, make logs public
20:37:48 rrsagent, draft minutes
20:37:48 I have made the request to generate http://www.w3.org/2013/02/11-dntb-minutes.html npdoty
20:43:58 dsinger has joined #dntb
20:56:01 peterswire has joined #dntb
21:09:05 dsinger has joined #dntb
21:16:49 schunter has joined #dntb
21:33:15 schunter has joined #dntb
21:38:47 rrsagent, draft minutes
21:38:47 I have made the request to generate http://www.w3.org/2013/02/11-dntb-minutes.html tlr
21:54:27 peterswire has joined #dntb
22:04:56 schunter has joined #dntb
22:19:38 dsinger has joined #dntb
22:36:49 Zakim has left #dntb
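
The "hash by campaign" and "hash with limited lifetime salt" ideas from the discussion can be sketched for the frequency-capping use case. This is a hypothetical illustration, not something presented on the call: `campaign_key`, `FrequencyCap`, and the in-memory counter are assumed names and design choices, and HMAC-SHA256 is an assumed choice of hash. The point is that the cap is enforced per (user, campaign) key, so no cross-campaign list of ads seen by a user is kept, and rotating the salt bounds how long any key stays linkable.

```python
import hashlib
import hmac
from collections import Counter


def campaign_key(user_id: str, campaign_id: str, salt: bytes) -> str:
    """Derive a per-campaign pseudonym for a user under the current salt."""
    msg = f"{campaign_id}\x00{user_id}".encode("utf-8")
    return hmac.new(salt, msg, hashlib.sha256).hexdigest()


class FrequencyCap:
    """Counts impressions per hashed (user, campaign) key only."""

    def __init__(self, max_impressions: int, salt: bytes) -> None:
        self.max_impressions = max_impressions
        self.salt = salt
        self.counts: Counter = Counter()

    def rotate_salt(self, new_salt: bytes) -> None:
        # Old keys cannot be recomputed without the old salt, so the
        # old counters are dropped; this is the limited-lifetime part.
        self.salt = new_salt
        self.counts.clear()

    def try_show(self, user_id: str, campaign_id: str) -> bool:
        """Return True and count the impression if the cap allows it."""
        key = campaign_key(user_id, campaign_id, self.salt)
        if self.counts[key] >= self.max_impressions:
            return False
        self.counts[key] += 1
        return True
```

A usage example: `cap = FrequencyCap(2, b"salt-for-this-period")`, then call `cap.try_show(user_id, campaign_id)` before serving an ad. Because each campaign produces an unrelated key for the same user, joining a user's activity across campaigns requires recomputing every key from the raw identifier, which matches the remark above that the approach costs more to process but parallelizes trivially.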