Privacy Interest Group Teleconference -- 29 Oct 2015

<trackbot> Date: 29 October 2015

<wseltzer> present=AxelPolleres(observer), colin, hadleybeeman, dka, kodonog, rigo, twhalen, wseltzer

<twhalen> Waiting a bit to get all necessary parties into the room before starting...

<hadleybeeman> scribenick: hadley

<hadleybeeman> scribenick: hadleybeeman

fingerprinting guidance

<twhalen> https://w3c.github.io/fingerprinting-guidance/

npdoty: Summary of the document.
... This defines browser fingerprinting: the capability of a site to identify or reidentify a user based on some characteristics of the user agent or their device.
... It takes into account as many features as possible, so all features/functionalities contribute to the same problem.
... So it would be useful to provide guidance across W3C to coordinate on this.
... This doc has been reviewed by you all, and by the TAG.
... We've had some outside feedback.
... Recent changes: based on TAG feedback, to emphasise what can/can't be done.
... The TAG (Technical Architecture Group) has a finding on unsanctioned tracking.
... We've seen this in use on the web... maybe by header enrichment, or fingerprint tracking, super cookies and various other locally stored objects

<wseltzer> TAG on Unsanctioned Tracking

npdoty: Basic finding was that this was harmful to users
... We have actual mechanisms for enabling state on the Web, like with cookies.
... Using these unsanctioned methods gives users less control and can break the same origin policy.

dka: The user control point is the main issue there.
... The sanctioned methods allow hooks in the browser to clear the state.

npdoty: They also made the point that there is little we can do, technically, about this problem.
... We can't just forbid a small set of features and stop this happening.
... This document now tries to elaborate that there are things we can do to mitigate the problem.

<Zakim> rigo, you wanted to ask about hasec

Rigo: In the Security discussion, it came up that hardware security features would be used for fingerprinting.
... My response was: before getting any credential, you have to access this kind of API.
... Prove what is behind this thing. So it's not obvious that this contributes to fingerprinting more than other things.
... That's the first question
... Should we take hardware security and web crypto APIs into account?
... Second: If the browser exposes lots of info to the server,
... this is a passive way of fingerprinting, but this is an assumption that the browser doesn't lie to the server or inject arbitrary white noise.
... Has this been explored?

npdoty: On the first question: Hopefully we can apply this document's best practices to the hardware APIs.
... I'm not deeply familiar with web crypto, but I think it has strong protections in place to prevent cross-origin fingerprinting

Rigo: There is a specific provision in the Hardware Security charter to mitigate XSS and injection attacks
... To limit the scope and availability of that interface
... They will continue to ask

npdoty: Hopefully this document will help

wseltzer: Would PING be interested in work to document the same origin model in privacy? To answer the question in hardware security
... of what happens when we want to expose the same identity across multiple site interactions?

rigo: Is it limited to cross-site? or does it address also fingerprinting?
... First and third parties are confusing here

npdoty: This document addresses the narrowest first-party case. We assume consumers can clear cookies and be forgotten.
... But in unsanctioned tracking, the lack of control is a bigger concern

rigo: Which in turn triggers a legal discussion of how long you may retain data, as a service
... Because if you can't remember the profile... Enforcement-wise, it's much weaker, but this is the logical consequence

npdoty: Are there people interested in exploring that cross-origin concept?
... Did that come up in other groups this week?

wseltzer: I was picking up on npdoty's conversation about the implicit privacy models of the web, following security models of: do we get users building up expectations based on segments to origins?
... That should be something that the Web helps to cement

npdoty: Yeah
... It might be useful to describe some of the concrete privacy cases
... I (and some of my friends) use a private browsing window to look up sensitive things, or for testing sites (logged out)
... So we assume that the things we do when in private browsing or in a separate browser are not connected to what we are doing elsewhere

rigo: If you have a VPN... I would say "whoa, what an expectation"

twhalen: There is a lot of confusion about what private browsing means

rigo: I'm not too sympathetic to those definition discussions
... Roy Fielding has discussed for four years what tracking means. To no avail.

npdoty: It might be more useful to talk about use cases

rigo: yes.
... Legal people tend to make a definition and then attach meaning to it. Use cases are a better method

npdoty: So those are the recent changes to the document.
... 4. Feasibility: you can decrease the surface area/add entropy to the system
... Esp detectable fingerprinting -- there's been discussion on the mailing list.
... Even if we can't prevent fingerprinting, if we can make it clear to researchers, regulators etc that a specific site is fingerprinting,
... then we can use measures outside of technical measures in the browser to address the problem
... Best practices are the actual recommendations in this doc
... Not sure yet if they're all equally useful
... [reads the best practices from the document]
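A minimal sketch of the surface-area point under item 4, assuming a made-up population of eight user agents with three hypothetical attributes (`ua`, `tz`, `fonts`); none of these values come from the document. It illustrates how each additional exposed feature shrinks the anonymity set, which is one way to read "decrease the surface area":

```python
from collections import Counter
from itertools import product
from math import log2

# Hypothetical population: every combination of two UA strings, two time
# zones and two font counts. All values are invented for illustration.
POPULATION = [
    {"ua": ua, "tz": tz, "fonts": fonts}
    for ua, tz, fonts in product(["A", "B"], ["UTC", "CET"], [10, 11])
]


def anonymity_sets(population, features):
    """Count how many user agents share each combination of feature values."""
    return Counter(tuple(agent[f] for f in features) for agent in population)


def surprisal_bits(population, features, target):
    """Bits of identifying information that `features` reveal about `target`."""
    sets = anonymity_sets(population, features)
    key = tuple(target[f] for f in features)
    return -log2(sets[key] / len(population))


target = POPULATION[0]
for features in (["ua"], ["ua", "tz"], ["ua", "tz", "fonts"]):
    print(features, surprisal_bits(POPULATION, features, target))
```

In this toy population, exposing all three attributes makes every user agent unique, while withholding any one of them leaves each agent hidden in a group of at least two.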

<Zakim> hadleybeeman, you wanted to ask about research methodology

hadleybeeman: How did you come up with these and is there anyone who uses fingerprinting involved in this?

npdoty: We did research, stuff from browser vendors including TOR who have noticed these things. Researchers find what's possible.
... This TOR document goes into a lot of detail. A full list of what could be used for fingerprinting.
... To your second question, I don't think we've had a lot of participation from people who are using fingerprinting

wseltzer: We could invite Dan Kaminsky to review this. They've just joined.
... To propose Iron Frame. Their work is around distinguishing bots from genuine traffic for advertisers.
... They are doing things that might look like fingerprinting but might be interested in talking with us about how to optimise for privacy

npdoty: Back to the doc... More examples might help
... I've tried to collect some of the research, but there is more out there.
... Lots of people have sites you can try to see if your browser is unique. Apparently it's a fun site to build.
... Current status of the doc: still open issues, but no more gaping holes.
... We've asked the group if they are interested in publishing this as a draft of a group note
... Not sure when we'll wrap up some discussion

twhalen: Another round just went out on this. The responses thus far have been positive, as we're looking for consensus.
... I expect there will be more time for additions, but there's no reason not to put it out there

wseltzer: So should we publish it now?

twhalen: Fine with me

npdoty: I think it's ready

twhalen: We've done two rounds of approval

dka: I'm channeling slightlylate, who says we want to focus on specific technical guidance.
... I think he wants to see more technical guidance in the doc for spec developers on how to mitigate these issues
... Have you been discussing this with Alex or Mark Nottingham?

npdoty: I would love to hear more on the question

rigo: ask him whether browsers would implement a threshold where if a site makes 50 requests that don't make sense in the context of the page, they should prompt the user

npdoty: I think it would be valuable, like how the security questionnaire has become more valuable, with specific examples of what working groups have faced and how they responded

<rigo> http://techcrunch.com/2015/10/28/fundamental-rights-vs-self-regulation/

npdoty: We could do more with these, for every best practice
... We just got this feedback from research on canvas fingerprinting... they crawled the web looking for companies that were doing it
... They found that the canvas API made it easy for them to do that

<rigo> He also stressed that the call for the development of better user control technologies is something the report says “can only happen with clear guidelines and legal interpretations from regulators”.

dka: That's perfect. Very good technical feedback
... other feedback from mnot: he's happy to help on this
... Generally, you have the support of the TAG. Don't want to overstate any consensus, but we would like to see this doc have as broad a review as possible
... Esp by those writing web APIs inside and outside of W3C
... So we can help make noise about it

npdoty: Part of the motivation for publishing a draft note is clarifying this doc's standing and completeness

dka: Yes, that would be good. Guidance can evolve

keiji: I think this is good work. When I look at it, I wasn't clear if it was about tracking or fingerprinting
... Also it seems to mix user tracking and browser tracking
... Do you have a definition or clarity on that?

npdoty: Can you explain what the difference is for you?

keiji: Some people may share the same browser.

npdoty: Good point

keiji: Also, fingerprinting is just one way of tracking. If we care about tracking, we might need to talk about other things.

npdoty: That's useful feedback
... I think we do want a narrow scope, but there are lots of things that manage state and can be used for tracking on the Web
... This document isn't about tracking generally
... We're not saying this will prevent all unsanctioned tracking, just this one capability

keiji: So browser fingerprinting?

npdoty: Yes
... The user, the user agent, or the device... Sometimes we assume they are all the same
... Maybe we should remove 'user'?
... Some techniques measure how you type, for example. That's user behaviour.
... And this might give you guidance about that

keiji: I thought about that. Any identifier about the browser -- if this is about fingerprinting, we don't have to mention @@
... The persistent unique identifier isn't like hardware/device, for example

npdoty: Yes, lots of APIs are introducing ways to store an identifier. So we need to give advice that when groups create a new one...
... We need to have that written down. Maybe in this doc, maybe elsewhere.
... Moving to open issues in github...

hadleybeeman: Can I add another one? The title of the doc sounds to me like it should be guidance on how to fingerprint

npdoty: Can you recommend an alternative?

dka: Counter fingerprinting? Unsanctioned fingerprinting?

rigo: Hacking fingers off?

npdoty: It's a real question... Anti can turn people off us.
... We could use 'mitigation'?

dka: that's a good word

hadleybeeman: I just don't want it to be misleading, and people to miss it when they need it

<dka> “Mitigating Fingerprinting in Web API Design”

<dka> (an O’Reilly book)

hadleybeeman: +1 to dka

npdoty: Are there other issues, group?
... If people want to take an issue and issue a pull request, that'd be great.

<npdoty> http://www.w3.org/mid/CAC1M5qq=wHYHa_x5it67_fDNuL2hYh2PQdhaBKuP4+F-ByJ+Kw@mail.gmail.com

npdoty: And for now... words "correlating" "linking" "identifying"... Are there better words?

rigo: in the EU privacy discussion, they use the term "Singling out". You don't actually need to know their identity -- but you can single them out and discriminate.
... The reason to have Data Protection laws is to avoid discrimination. Or at least, undue discrimination

npdoty: Fingerprinting isn't always identifying to someone's name. A lot of it is whether some party can connect disparate activities (like private browsing and non-private browsing)

rigo: It's a slippery slope. At some point, you can apply the theory of Shannon and Weaver.
... Because you'll have a large aggregated pile of data, and at some point you'll be able to connect that to the individual
... Sweden has calculated the risk of identification

dka: I think "correlate" is fine

hadleybeeman: Why was this an issue initially? People concerned it might be confusing?

npdoty: Not sure. Multiple people thought it could be simplified

dka: Linking and identification are also loaded terms

hadleybeeman: In data, linking has a lot of baggage
... and plain English is useful here

dka: I think "correlation" works

rigo: If they still complain, you can move to "single out". Pure "correlation" is a bit more than that
... But I could live with "correlation"

<rigo> hadleybeeman: correlation leads to exploitation, it is beyond threat model. It is explaining what outside the standards world this document is explaining

<rigo> ... not endorsing though

hadleybeeman: You could explain what the fingerprinting -> correlation -> identification -> discrimination exploitation process does. As context. To then establish how the threat models interface with what we're doing in the browser or web standards.

keiji: What would be the main problem you're describing in "correlation"?
... The linking itself is a problem. Correlation is a result of that.

npdoty: We tried to have these different threat models, for example "this could be used to identify you".
... That can lead to things like physical violence.
... There is another range of people being frustrated or unpleasantly surprised when they see the same info about themselves on different sites.

keiji: The situation itself is just disclosure of browser activities.

rigo: You can deduce lots from browser activity. Geolocating somebody and killing them, which happened in NY where a frustrated husband killed his wife because the browser was geolocating her... it's a different activity
... You have to be careful not to be too abstract.

keiji: In that sense, "correlation" may be too wide

npdoty: It's not just the URLs you visit, but that you can connect data you put in to one site to activity elsewhere

rigo: fingerprinting provides a state machine

npdoty: So that's the doc, at this point

rigo: But you don't have active mitigations, and passive mitigations in this

npdoty: What would be active?

rigo: Browser creating artificial entropy

npdoty: So there are two things there.

rigo: in many of the non-mainstream browsers, you can mask as another browser. This is used to circumvent the build of ad-blocker thing. Javascript detecting ad-blockers and showing you something else

npdoty: 1. TOR Browser has a good explanation of why that won't help. If you could randomise a value, you could just nullify it for everyone.

rigo: If you have value a at site 1, and value b at site 2, then I'm fooling the algorithm
... TOR is making the mistake of looking at the absolute security and then sacrificing your relative security over the absolute one
... Which is a common mistake
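A minimal sketch of rigo's "value a at site 1, value b at site 2" idea, assuming a hypothetical browser that derives the reported value from a per-session secret and the requesting origin. The function name, the list of screen widths and the example origins are all invented for illustration; whether this approach actually helps is the point under dispute in this exchange.

```python
import hashlib
import hmac
import secrets

# Hypothetical set of plausible values a browser could report.
SCREEN_WIDTHS = [1280, 1366, 1440, 1920]


def per_origin_value(session_key, origin, choices):
    """Pick a value that is stable for one origin but keyed to this session."""
    digest = hmac.new(session_key, origin.encode("utf-8"), hashlib.sha256).digest()
    index = int.from_bytes(digest[:4], "big") % len(choices)
    return choices[index]


session_key = secrets.token_bytes(32)  # discarded when the session ends
for origin in ("https://site1.example", "https://site2.example"):
    print(origin, per_origin_value(session_key, origin, SCREEN_WIDTHS))
```

With only four candidate values, two origins will often receive the same answer, so a real design would need a much larger value space for the per-site values to diverge reliably.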

npdoty: I'm not sure they are.
... I think their math is right

rigo: My suggestion is to inject, not to manipulate the value that exists but to give them additional values

npdoty: there is some academic work on this

rigo: Yes, they tell me anything is deanonymisable, esp by quantum computers

npdoty: I've talked to the academics and the TOR implementers, and they both disagree. I'd suggest you read what they've written before you decide if they're wrong.
... Also, once you can detect fingerprinting, you can imagine a browser that just sees it and stops it happening
... throwing errors or sending random stuff to the site

rigo: which would kill online banking.

npdoty: This is an implementation issue. I think TOR is willing to do this... if it sees you making enough calls for fonts, it stops returning results.
... But I don't think that's something we should standardise
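A minimal sketch of the "stop returning results after enough font calls" behaviour described here, and of the threshold rigo suggested earlier. The class name, the per-origin counting and the default limit of 50 are assumptions for illustration; this is not how TOR Browser actually implements it.

```python
from collections import defaultdict


class FontProbeLimiter:
    """Hypothetical per-origin cap on font availability queries."""

    def __init__(self, installed_fonts, limit=50):
        self._installed = set(installed_fonts)
        self._limit = limit
        self._calls = defaultdict(int)  # origin -> number of queries so far

    def is_installed(self, origin, font):
        """Answer truthfully up to the limit, then return None (no answer)."""
        self._calls[origin] += 1
        if self._calls[origin] > self._limit:
            return None
        return font in self._installed


limiter = FontProbeLimiter({"Arial", "Times New Roman"}, limit=2)
for font in ("Arial", "Comic Sans MS", "Times New Roman"):
    print(font, limiter.is_installed("https://site.example", font))
```

The third query from the same origin gets no answer, while a different origin starts with a fresh count.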

<npdoty> hadleybeeman: everything we do could break something, but we shouldn't avoid improvements for that reason

hadleybeeman: Ultimately, they're looking to us to build a better Web anyway

npdoty: I'll follow up on the mailing list for implementations, and Rigo will look into anonymisation
... I'd be curious to hear

rigo: In this area, sometimes you have this pseudo-security where it's actually the opposite. So it's worth verifying.

keiji: I have thought similar things, re injecting random variables.
... The problem is we intentionally inject wrong value, but if some problem happens -- we are responsible for that.
... So usually computer systems don't tell lies.

rigo: My computer lies all the time
... It translates my conscious decision to lie to the DNS system
... For example by setting all the trackers we found in the PrimeLife project to 127.0.0.1

<rigo> primelife.eu

<rigo> 127.0.0.1

rigo: which is effectively an ad blocker without being an ad blocker
... The discussion we have now: this collides with copyright protections and anti-circumvention rules.
... Now they say, if you don't load their javascript... then you are circumventing their efforts.
... But I claim that none of their protocols require me to accept everything
... Perhaps we should ask the TAG about this
... The court in Hamburg just decided that you are legally obliged to take everything the server serves
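For rigo's DNS example above (pointing tracker hostnames at 127.0.0.1), a minimal sketch, assuming the redirection is done through a hosts file. The hostnames are placeholders; the PrimeLife tracker list itself is not reproduced here.

```python
# Placeholder hostnames standing in for a real tracker list.
TRACKERS = ["tracker.example", "ads.example", "tracker.example"]


def hosts_entries(hostnames, sink="127.0.0.1"):
    """Return hosts-file lines that resolve each hostname to the local machine."""
    return [f"{sink} {name}" for name in sorted(set(hostnames))]


for line in hosts_entries(TRACKERS):
    print(line)
```

Requests to those hostnames then never leave the machine, which is why this acts like an ad blocker without any blocking code running in the browser.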

hadleybeeman: That sounds like something the TAG should discuss

npdoty: [reviews actions]

twhalen: [break until 10:30]

<rigo> https://www.spiritlegal.com/de/urteile/beschluss-lg-hamburg-308-O-375-15-urheberrecht-bild-de-axel-springer-vs-eyeo-adblockplus-eV-95a-UrhG-adblocker-verboten.html

<rigo> LG Hamburg decision on ad blockers

<mnot> +present

<npdoty> scribenick: npdoty

Privacy/Security Questionnaire(s)

tara: a little background on privacy/security questionnaire
... done some work on privacy considerations for spec-writing
... but how might we set up guidance for spec writers in general, what should you be thinking about?
... so that when people come to PING for advice, have a set of questions they've thought through already
... and mkwst had already done some work on a security questionnaire, with interest from the TAG
... so have pulled out the more privacy-relevant sections of the questionnaire, to flesh out that we can use for a more privacy-specific work
... volunteers from CDT (Greg and Joe) who have been trying it with documents, including the Presentation API
... experience on the mailing list @@

http://www.w3.org/mid/CAMJgV7Z=tCbMJC1d3FAP9rtQjxZCv_yE3Grw5Q3VNi+J4VtYOA@mail.gmail.com

tara: is there a privacy considerations section? don't mandate that kind of thing, but have discussed it
... as has been done in some contexts for security considerations
... beneficial to explicitly include such a section in the document
... 2) "personally derived data"
... with perhaps some debate over that definition

npdoty: could we just never use a defined term like "personal information", "personally identifiable information", "personally derived information"
... just leads to fights over whether something falls in that category, when it would be more useful to just provide examples or a general descriptor ("information about the user or their environment")

rigo: legal debates between jurisdictions about identifiability
... uncertainty about what qualifies to "identify a person"
... instead, more useful to describe what "singles out" someone

rigo: hard to discuss concepts based on the system of terms

mkwst: at a high level, look at the purpose of the document
... +1 to npdoty that using these terms is problematic, because they're ambiguous, and can have a different meaning in different contexts or different jurisdictions
... more important is the idea that puts into the head of the spec writer
... dropping it from the document doesn't change the meaning of the document, because other questions get at all the constitutive parts, like sensor data or other kinds of data that might be interesting

"interesting data"

<rigo> Mike explains data life cycles, that are partly (nicely) covered in the document

mkwst: using terms that people are going to argue about doesn't support the goal of the document

<rigo> goal == purpose of data processing

mkwst: the security questionnaire with the TAG had a purpose of being given to someone writing a spec, and was still already too long
... and then you can come to experts and get feedback on those questions
... this document looks like it might be too detailed for the purpose of giving to non-experts
... but could still be a useful tool for the experts in trying to do review

rigo: for spec-writing, could just ask questions about the data lifecycle
... questions about generation, retention of data cover the important debates

mkwst: not sure that applies for people who are working on their own feature, if categories are too broad, more likely to be missed

<rigo> data life cycle can have arbitrary levels of hypertext

mkwst: specific questions like "do you use sensor data?" rather than "do you collect data?"
... has been helpful inside Google to identify issues in product teams

<Zakim> npdoty, you wanted to include Joe Hall's comments on mkwst/audience point

mkwst: isn't always clear that you're collecting data

<twhalen> npdoty: Joe and Greg (CDT) were working on this topic. Joe thinks this questionnaire is more useful for PING to use for reviews.

<twhalen> npdoty: but also we, as PING, might be able to add details that will help others (non-experts) as well

<twhalen> mkwst: in ideal world, questionnaire would be more fleshed out, and linked to a document like the one PING has right now, with more details.

rigo: here we're addressing data protection more than personal privacy
... data lifecycle is less complex than it might be, because people can answer what they are collecting, why they are collecting it and how they might re-use it
... can of course provide links to more information

mkwst: specific concern is that "data collection" means different things in different contexts
... for example, providing a sensor might sound like you're not collecting data, because you might be enabling collection elsewhere
... that is, different people will have different understandings of the same feature
... we want to give an easy introduction, which is helped by examples and pointed questions

rigo: what would I do as the implementer vs. a separate party who is thinking about the risk of a particular feature

mkwst: +1
... would be helpful in trying to get a simple document for people who don't already care
... I'd like to understand what the minimal set of things would be
... and whether we agree on minimal set being the right goal
... want something shorter/simpler than the document I authored
... I like the idea of having a framing mechanism of a data lifecycle, in order to assess risk

tara: a TAG questionnaire to be the shortest possible form, with links out to the more detailed document for people who really care and want to know more

mkwst: things I like: description of the threat models
... and specific examples of what other specs have done to mitigate a particular threat

<twhalen> https://w3ctag.github.io/security-questionnaire/

mkwst: I think we need more mitigation strategies
... I like having the threat models
... but don't think we have the right questions yet
... was useful in the Google context, in terms of trying to get earlier review

<twhalen> npdoty: I used this questionnaire with the Manifest spec

<twhalen> npdoty: sometimes you can answer the questions but not know why

<twhalen> Different people might come up with different answers for "origin"

<twhalen> Geofencing example -- had different responses to same question

<twhalen> npdoty: became an abstract example; "expose data to an origin it currently doesn't have access to"

<twhalen> npdoty: might be hard for group to *know* what to do with an answer

<twhalen> npdoty: maybe in those cases give link to the more detailed document?

mkwst: my goal would be to get this down to a one-pager that they can skim over
... and have a separate expert version

tara: 1-pager should cover both security and privacy?

mkwst: yes, very similar.
... I am looking for someone else to do the work ;)

kodonog: to be clear, we're not going to go through each question right now

mkwst: what would you ask during a five minute conversation?

mnot: I'm not sure what we can communicate in five minutes, or 1-page document
... if they're asked, do you have high value data, they'll just say no

kodonog: is there any way we can communicate that this is more complicated than you think

mkwst: last sentence can be pointers to more complicated document and suggestion to talk to PING or other groups
... people often haven't thought about security/privacy more than just a recommendation that they should think about them
... no single document is going to replace the kind of work that an expert can do

drogersuk: possibility of willful ignorance, for people who are concerned about opening a door

mkwst: requiring a priv/sec section would be useful, because if it's empty, we know that that document needs to be reviewed/audited

npdoty: even if we want to encourage doing privacy throughout the document, it's still useful to have a summary section

drogersuk: doing privacy/security throughout an entire document can actually make the conversation more difficult

mkwst: can the TAG tell us what would be helpful when they ask for a review?

dka: some people can be in a state of denial. having a set of points you have to think about makes it easier
... if someone has a list in front of you, you're more likely to realize

mnot: having it recorded in the document that a group keep state on what they've thought about for privacy/security is useful

mkwst: saying there are no considerations is basically a challenge

dka: I think more people are coming to the TAG for JavaScript API design, but I would like us to be able to do more
... having a 1-pager would be useful, iff you can follow up with the details

kodonog: done is better than perfect
... concept of a questionnaire has been around for a long time. the short document might be a way to kickstart activity

mnot: people can have plausible deniability, would like to get it out more officially
... we want tools so that we're not a bottleneck/single point of failure for reviews

mkwst: if you were a bottleneck, I would be happy that everyone was going through you for reviews

dka: putting a bunch of information out there so that people are primed

mnot: have shied away from making things required, want something more self-help style
... having it out there and popular is the carrot, the stick would be requiring a section even if it's not a particular format

<dka> http://w3cmemes.tumblr.com/post/132186476482/the-fingerprinting-guidelines-may-need-a-new-name

<mnot> +1 to new name

rigo: for people who are fully in denial, no questionnaire can help
... making it a nice hint for people to read a 1-pager about data lifecycle, the expectation that questionnaires will lead to more privacy in specifications is overly optimistic

dka: nothing we can do to force people not to ignore. but aren't there people who will be informed/improved? mitigation, rather than solution

rigo: think we need technological leadership rather than questionnaires

kodonog: there is a category that would do the right thing if they had better tools
... there is a category that thinks they already understand the problem, but might respond to feedback/pushback
... and there is a category where we can't do anything, who will avoid

dka: +1

rigo: an example in web/internet of things
... had a long history of using the sticky policy paradigm
... but this isn't well-explained, a questionnaire for implementers who don't know what to do won't improve the design
... shouldn't believe that it's _the solution_

kodonog: all agree that it's not the only solution

rigo: just want to be careful about the standing

<Zakim> npdoty, you wanted to comment on frank dawson proposal

<twhalen> npdoty: earlier Frank Dawson had a document, didn't get a lot of uptake but was a start; had groups do a sort of modelling like a life-cycle

<twhalen> npdoty: if we say, perhaps, there's a one-pager with five questions, answer them and then come to us; think through the threat models, data lifecycle...

<twhalen> npdoty: and then later we may have a more detailed list

kodonog: experts in particular areas need pointers to quick resources

keiji: security/privacy issues are a matter of decision-making
... people who are aware can make more clear decisions
... sometimes we have clear solutions, sometimes we don't
... so that some group can make their own decision
... but where we have clear solutions, we may be able to do something more pro-active

kodonog: if you talk to people earlier in the process, they'll be more receptive to change
... late in the process they're likely to be quite resistant

drogersuk: possibility of having active opponents to a more privacy-preserving solution

tara: seem to have general consensus on the short/sweet version
... would be a useful first pass, supported by mkwst, the TAG and others around the table
... PING is happy to take some work on for that first pass, what are the most important points
... although of course we need people to weigh in who aren't thinking about this all the time
... the longer, more robust document also working on, also needs to be matured
... can continue some conversation at IETF next week

<twhalen> npdoty: logistics of working with TAG in docs?

dka: bcrypt in the TAG working on it, but could use someone with more involvement

npdoty: would be weird to call it a "finding"

rigo: need to be clear about the terminology, to avoid unnecessary controversy

<schuki> mkwst

<twhalen> npdoty: can we get something formal in place via director about mandating privacy & security considerations sections?

dka: TAG agreed to publish it as a Note at their Boston F2F
... it's in the minutes!1!

rigo: if it's more privacy-specific, should it be a PING document, with then TAG review?

dka: open to different configurations

npdoty: came from a security context, which is how it ended up at the TAG, which noticed that it touched on multiple groups' interests

hadleybeeman: interested in the value of the document to put more TAG weight behind it

<hadleybeeman> hadleybeeman: Not in taking any ownership from PING

tara: don't want to step on anyone's toes, trying to figure out who should hold the pen

kodonog: could encourage Greg/Joe to work on it separately, get some maturity, and then bring it back to the TAG

dka: but keep it in Github so that all the state is gathered together

(which might involve mechanical forks, but the goal is not to fork into entirely separate directions)

tara: but CDT would also like to see more activity on the mailing list to give feedback

kodonog: shorter documents would be a good way to improve feedback

tara: let's not let Greg/CDT taking the lead on the document mean they become the only people doing the work

<twhalen> npdoty: incognito mode/private browsing mode keeps coming up as an issue

don't have a standard view of private browsing modes, which makes it difficult

<twhalen> (Item 5 in the longer Privacy Questionnaire)

mnot: talked about that since last TPAC
... agree that it could be useful
... but would probably be, or at least start, on the more minimal side
... to have it defined and to make it easy for other specs to refer to it
... could have more aggressive views regarding what it could do
... trying to work on a vocabulary for those different kinds of attackers
... currently would just be the local, subsequent attacker

<Zakim> npdoty, you wanted to comment on cookie jars

<twhalen> dsinger had talked about limited contexts/scope

mnot: relatively well-behaved web servers vs. more aggressive fingerprinting-like alternatives

<twhalen> ...which is basically like separate cookie jars, but that isn't the case for Safari, apparently

npdoty: don't all such modes include separate cookie jars?

mnot: not sure Safari does that
... reasoning would be that even the local attacker can access information about the user on servers if the cookies are maintained

rigo: isolate identifiers for a single session, and throw everything away at the end of the session

mnot: so many different pieces of state that are kept, and different browsers have different ideas about what state to keep and which to clear
... will try to share paper

<keiji> http://crypto.stanford.edu/~dabo/pubs/papers/privatebrowsing.pdf

keiji: a famous paper from Stanford 2010

<keiji> 2012 http://techlogon.com/2012/05/10/comparison-of-private-browsing-in-ie-chrome-and-firefox/

npdoty: would be useful to find or gather more recent work

mnot: get the impression that Safari philosophy hasn't changed

rigo: private browsing modes could consider sandboxing of certain functionality

<twhalen> 12:30 is lunch. :-)

npdoty: webappsec is looking at confinement (COWL) and containers

mnot: since we have access to browser vendors, it would be good to keep an up-to-date list of what's being done

npdoty: like a test-suite sort of page

<rigo> https://www.strews.eu/images/StrewsWebSecurityArchitecture.pdf

<twhalen> npdoty: include secure contexts in this questionnaire?

<rigo> is the security architecture consideration

<twhalen> npdoty: include Security Interest Group in this discussion?

colin: is this document intended to be just Privacy?
... this is the only group looking specifically at privacy, where there are others that are looking at security

npdoty: but we don't have a separate active group that would be looking at security reviews of specs, distinct from us doing privacy reviews

<twhalen> npdoty: Item 9 -- issues around user awareness becoming significant, especially for things like Geofencing and Service Workers -- in the background

<twhalen> npdoty: revocation came up in WebRTC yesterday and expected in Permissions API discussions as well

<twhalen> npdoty: WebRTC would like this issue to be decided uniformly and not just one-by-one

drogersuk: glad that Permissions has involvement from apf
... but current example is the most straightforward (camera), when it can be more complicated
... if we have to rely on prompting, suffers from the "dancing pigs problem"
... that people will jump over any hurdle for something sufficiently attractive, like dancing pigs

https://en.wikipedia.org/wiki/Dancing_pigs

drogersuk: a real problem in Android, which emphasizes the importance of revocation
... and a granularity problem of providing too much information to the user
... currently expect the Permissions API is not yet mature enough to be useful

<twhalen> Need to look at granularity (not currently spelled out in the questionnaire).

<drogersuk> to be clear - Android doesn't emphasise the importance of revocation but it demonstrates the problem!

tara: got a couple of things discussed on this list

[adjourned for lunch until 1:30]

tara: will discuss items in different specs and questions from other groups this week

<twhalen> restarting shortly...

<kodonog> scribenick: kodonog

twhalen: Getting started again...

Secure Contexts

<npdoty> discussed in webappsec yesterday: http://www.w3.org/2015/10/28-webappsec-minutes#item04

twhalen: PING review requested for "Risks associated with non-secure contexts"
... what is a user's understanding of what is happening with their data. (see item 6 of section 4.3)

ndoty: API that frustrates the ability to mitigate future access to data {FULLSCREEN}
... one thing that might be useful is that we use the same definitions of threats
... for example passive and active network attacker

colin: some groups have identified additional types of attackers

ndoty: topic on 4.2. Ancestral Risk
... I got an action out of yesterday's webappsec meeting to provide the limitations of the threat model
... we should make it consistent in what specs do with secure context (from Mike in yesterday's meeting)

nick: how do we indicate to the developer that it isn't working

ndoty: on the one hand it shouldn't be there, and on the other hand it should throw an exception
... how do browsers do this today?

clear site data

http://w3c.github.io/webappsec-clear-site-data/

twhalen: also discussed in webappsec yesterday

use case is a way for the website to recover

ndoty: there are a few different use cases and it isn't entirely clear which one they are trying to cover
... there is a separate proposal (use case): the site thinks the user might be doing something sensitive or vulnerable (domestic violence site), and might want to suggest to the user that he/she clear the info

rigo: if you take the context out of the browser without the user knowing it can have negative consequences

mek: sites can do that anyway
... with this api the sites can clear the data atomically

rigo: here it is all or nothing

ndoty: when you are recovering from a breach then all or nothing is appropriate

rigo: it depends, you may not

ndoty: use case I was thinking of... if a doctor is reading email, you want to clear all the information in the browser when he/she logs out

mek: we've had sites ask for that

ndoty: I want to clear the names of the pages that I went to
... a malicious site could remove itself from your history

rigo: what I am struggling with is they (the sites) know more than I know

ndoty: to be clear, this is suggesting that a site should be able to clear information that it stores but not general history

twhalen: privacy considerations in the document
... 4.1 web developers control the timing
... 4.2 remnants of data on disk

whalen: are they missing any pieces of the privacy considerations that really ought to be in this document?

rigo: unexpected deletions may cause a "creepy" user experience

ndoty: I can see 2 things.
... first, when we talk about cookies and local storage... if they add anything that could be stateful they should clear it at the same time
... things that are outside the origin's control might still provide state
... second, permissions topic
... when do permissions get cleared

rigo: now worried about it looking spooky to the user

<npdoty> kodonog: are we looking at privacy considerations or usability considerations?

<npdoty> ... not all things that are "spooky" are privacy issues

<npdoty> rigo: spookiness might affect whether you want to use a tool again

<npdoty> tara: there will be some overlap, but some that are just out of scope

private browsing mode

<rigo> François

francois: original use case came from someone who was working on a suicide help line
... he didn't want people to be able to find history
... some people who are visiting the website aren't necessarily seeking help, doesn't make sense to clear state
... plus getting to the site generally means a long trail in advance and clearing the site itself doesn't really help
... this feature would clear all of the cookies for the last 5 minutes (for example), would need user consent
... unlike the previous feature we were discussing that would clear the information associated with a particular site
... this would be exposing an existing browser feature

ndoty: I'm curious about the 5 minute thing, i understand the clearing the trail thing, but how do you know that 5 minutes is the right amount of time

François: we don't know it is the right amount of time, potentially set by the site or configurable by the user

scribe: clearing too much might be suspicious

ndoty: I could imagine doing this search in one tab, and doing something else in another tab
... what I really want is to clear all the steps that got me to the sensitive site
... not to add to your work, but what if you just calculated a tree

dsinger: unanswerable to know how far back you have to go
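The "calculate a tree" idea above could look roughly like the following. This is a minimal Python sketch, assuming each history entry records the page it was reached from; the `Visit` type, its `opener` field and `trail_to` are hypothetical names for illustration, not any browser's API. As dsinger notes, real browsing does not necessarily leave such a clean chain, so this only shows the best case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Visit:
    """Hypothetical history entry: a URL plus the URL it was reached from."""
    url: str
    opener: Optional[str] = None  # page that led here, if known


def trail_to(history: Dict[str, Visit], sensitive_url: str) -> List[str]:
    """Return the sensitive URL followed by every ancestor that led to it."""
    trail: List[str] = []
    seen = set()
    current: Optional[str] = sensitive_url
    # Walk opener links upward; `seen` guards against navigation loops.
    while current is not None and current in history and current not in seen:
        seen.add(current)
        trail.append(current)
        current = history[current].opener
    return trail


history = {
    "search?q=help": Visit("search?q=help"),
    "forum/thread": Visit("forum/thread", opener="search?q=help"),
    "helpline.example": Visit("helpline.example", opener="forum/thread"),
    "news.example": Visit("news.example"),  # unrelated tab, left alone
}
to_clear = trail_to(history, "helpline.example")
```

Only the entries on the path to the sensitive page are selected, so activity in an unrelated tab is kept and the history does not look suspiciously empty.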

twhalen: this one has more risk because it clears history versus just clearing for a site

François: the woman that we spoke to indicated that some of these husbands install spyware and this won't help that

scribe: also if you are logged into google or something like that, even if the history is cleared from the local computer, ads and such may result on those sites

ndoty: did you consider ways to solve the logged in problem
... could be a page that indicates clear info related to this at google
... don't want that on your browsing history

François: can't solve all the problems,

dsinger: cleaning search results is harder than not collecting it to start with

François: our impression from visiting some of these sites is that they are trying to provide guidance on how to do this

scribe: we are trying to help automate some of it

ndoty: parents keeping an eye on their children is a different model than an abusive spouse

rigo: helicopter parents and jealous spouses are essentially the same from a means perspective

twhalen: appreciate that you have grounded this in a very specific use case
... I hope you are going to get enough feedback to make this useful
... is there something in particular that you want, what is most helpful, timeline?

François: we are going to do more user research with this, prototype it and take it to some set of users

scribe: if people think of other things that wouldn't get cleared that might be in scope please identify them
... for example, there might be a separate API for related Google searches
... We would like to ship this in Firefox, but there will be more value if we can trigger the clearing in every browser

https://wiki.mozilla.org/Security/Automatic_Private_Browsing_Upgrades

François: user has to click the "yes do this" button, but the user also has to initiate the action

ndoty: we are chartered to prototype, but we aren't chartered to do Recommendation Track

dsinger: I can see us drafting a pre-spec that talks about this
... this spec could include clear the path to where you are now,
... making sites aware in advance that you are trying to be private
... more uniformity on what private browsing mode means so users don't get confused [scribe missed third item]
... There is critical mass here to address private browsing in more detail

<npdoty> consistency/documentation of what private browsing modes mean in different browsers

dsinger: update on private browsing stuff, posting a message to the list a while back.
... suggested using a unique identifier to indicate to the site that you are in private browsing mode,
... correct pushback is that you are adding fingerprinting just when a user is trying to be private
... further realized that I don't need a unique identifier it just needs to be a small integer
... if the user presents as anonymous you won't even try to identify them
... we can at least go down to a small integer

<rigo> David Singer talks about DNT:2

dsinger: can even get to a single bit, if the site sees the bit on it can retrieve the cookie
... plausible to indicate i'm trying to be private now and have some <???>

rigo: may be able to leverage the infrastructure created for DNT for a different purpose

dsinger: I think this is worth working on. The Christmas present problem is becoming acute.

rigo: the solution is AdBlock+

dsinger: I would want a handful of significant sites saying they would be willing to implement.

<npdoty> https://wiki.mozilla.org/Security/Contextual_Identity_Project/Containers

François: this is mixing two things: private browsing sends a signal to be private, a site that is private browsing is isolated from other sites

<rigo> presents "containers"

scribe: Containers implemented in Firefox, hasn't been exposed to users yet
... not leaking information from that container

dsinger: seems we are both working on similar things so maybe it is time to do something
... it is a two way agreement, user indicates to the site that they wish to be private, and site indicates to user that it is respecting that wish

<twhalen> breaking until 15:15

Service Workers and related issues

<Mek> https://slightlyoff.github.io/BackgroundSync/spec/#privacy-considerations

<twhalen> mek: overview of Web Background Sync API spec

<twhalen> Give websites a way to more reliably sync data back to their server

<npdoty> scribenick: twhalen

Page might want to save something...but you closed things down too fast and lost it

Privacy Considerations: Location Tracking; History Leaking

npdoty: problems here (potentially) are: idea of making requests when you've closed a tab or moved on, or might be surprising to users that you're still communicating with someone after you've, say, moved elsewhere
... similar API -- Web Beacon Spec
... Beacon is from web perf group; want to send telemetry data back to a site before you navigate away

block on load -- prevent page from closing until it's done sending data

If that request failed, might want to retry it. (Similar issues.)

http://www.w3.org/TR/beacon/

mkwest: working on related item
... "one-stop shop" for reporting mechanism. Good time for feedback to get in - is new (started two weeks ago!)

<npdoty> https://mikewest.github.io/error-reporting/

mkwst - may work on this in web perf group, but feel free to file bugs against repo if you have feedback

idea is to roll together all the various reporting methods (feature-specific reports)

Beacon is more flexible than other mechanisms; unclear how they will fit together

npdoty: reporting always going to have this problem: asynchronous; also repeated to ensure data is saved
... other things in the background to talk about? Background sync?

mkwst: other things we want to do with reporting mech: public key pinning; network error logging; perf reports; likely others...
... Background sync seems like different use case

npdoty: but has similar properties - want you to keep retrying this thing after tab is closed

<npdoty> https://w3c.github.io/push-api/

Push Messaging

mek: with Geofencing it's obvious that you are being background tracked; not so evident with these other features (with no UI)

npdoty: showing slides - Service Worker Accountability

"What you can't see can't possibly hurt you" was the old model

But...see Service Worker-based APIs

Push; Background Sync; Geofencing

"Bad Options" - do nothing [about mitigations]; maintain visibility [e.g., user agent makes it visible]; task manager [list of what's in the background]; quota; or some combo of the above

Q: is there a better option?

Firefox push experiment (time-based quota)
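The minutes do not record how the Firefox experiment works. As an illustration of what a time-based quota might mean, here is a minimal Python sketch of a sliding-window limit on background events per origin; `BackgroundQuota`, `limit` and `window_seconds` are hypothetical names and the numbers are arbitrary.

```python
from collections import deque
from typing import Deque, Dict


class BackgroundQuota:
    """Hypothetical per-origin quota: at most `limit` background events per window."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window = window_seconds
        self._events: Dict[str, Deque[float]] = {}

    def allow(self, origin: str, now: float) -> bool:
        """Record and allow the event if the origin is under quota at time `now`."""
        events = self._events.setdefault(origin, deque())
        # Drop events that have aged out of the window.
        while events and now - events[0] >= self.window:
            events.popleft()
        if len(events) >= self.limit:
            return False
        events.append(now)
        return True


quota = BackgroundQuota(limit=2, window_seconds=60.0)
results = [quota.allow("https://site.example", t) for t in (0.0, 10.0, 20.0, 61.0)]
```

Here the third event is refused because two events already fall inside the window, and the fourth is allowed once the first has aged out.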

mkwst: tainting solution (COWL) available

rigo: service worker is there in case I go away but can be abused. Users may wish to take action as to when they push local data to the cloud.

mkwst: or you can go offline - keeps Service Worker from talking to the server
... there is network permission associated with an app; bit of a sledgehammer approach
... threat model is...?

rigo: I have an application (Google Docs); I am Edward Snowden, don't want Google to know what I am writing down in this doc; want to control when it is pushed to the server

dsinger: I like the idea that Service Worker is a part of server's activity - much like a cookie

mkwst: service worker executing locally -- acting more like an application. Geolocation ability associated with the website - Google docs, say, needs to find location-- that state associated with local application.
... has access to network and can broadcast this...but it's locally-running code.

npdoty: distinction is that these are operations that are not tied to the *window*; Service Worker's particular scenario -- has network access *and* not tied to visible page

rigo: was saying that maybe we want to be able to tell Service Worker to stay "offline."

mkwst: application can make policy decisions that you are talking about, not the browser. Would have to audit the code.

npdoty: one way to do this (as in Chrome) for Push message: ensure Service Worker can only access it when there's a notification present
... what if restriction was "only need visibility if you have outgoing network activity"?

mkwst: would just store and send later
... as long as the info that you want to protect is not available to JS, then that's a reasonable model. Otherwise...

npdoty: just wondering if there were a class of things for which you want to cache reports until you were "visible" again.

<npdoty> http://privacypatterns.org/patterns/Asynchronous-notice

mkwst: can add extensions if necessary (for power users)

npdoty: look at pattern - async notice
... one-time consent has issue that maybe you didn't consent and you don't find out (e.g. shared machine)

Agent gives notice to tell you about what the current status is (after the fact)

"Fire Eagle My Alerts"

npdoty: for repeated request: can see that one site got sent 200 requests versus usual five - were you expecting that?
... might be improvement over Task Manager model (which may not be checked)
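The after-the-fact notice described above can be sketched as a comparison of background request counts against what is usual for each site. This is a minimal Python sketch under stated assumptions: the `unusual_origins` helper, the `usual` baseline table and the `factor` threshold are all hypothetical, and a real user agent would need some way to learn the baseline.

```python
from collections import Counter
from typing import Dict, Iterable, List


def unusual_origins(
    requests: Iterable[str],
    usual: Dict[str, int],
    factor: int = 10,
) -> List[str]:
    """List origins whose background request count is far above their usual count."""
    counts = Counter(requests)
    flagged = []
    for origin, count in counts.items():
        baseline = max(usual.get(origin, 0), 1)  # treat unseen origins as baseline 1
        if count >= baseline * factor:
            flagged.append(f"{origin}: {count} requests (usually about {baseline})")
    return flagged


log = ["https://a.example"] * 200 + ["https://b.example"] * 4
notice = unusual_origins(log, usual={"https://a.example": 5, "https://b.example": 5})
```

Only the site that sent 200 requests against a usual five is surfaced, so the user is asked "were you expecting that?" once rather than being shown a full task list.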

<npdoty> rigo: specific proposals @@@ on JavaScript sandboxing

<rigo> -> send to webappsec

mkwst: checking in about the earlier specs we discussed today that he has worked on. Any questions?

npdoty: permissions -- do they get cleared on Clear Site Data request?

mkwst: in current proposal, no

we can talk about it

npdoty: what is the effect on evercookies?

mkwst - if associated w/server-side data, then we can't do anything about that

mkwst: proposal does "clear site data" and "clear site cookies", basically

site can clear DOM storage; cache; cookies [bit complicated]; prevent recreation of that information via JS context

npdoty: seems like if you cleared all those at once, then should get you back to "pristine state" except things granted to origin by the user
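As an illustration of a site asking for its own state to be cleared, here is a minimal Python sketch that builds a response header for a logout page. It assumes the `Clear-Site-Data` header with quoted directive names as in the later Clear Site Data specification; the draft discussed in this meeting may have used a different syntax, and `clear_site_data_header` is a hypothetical helper.

```python
from typing import Dict, Iterable

# Directive names assumed from the later Clear Site Data specification;
# the 2015 draft discussed in these minutes may have used a different syntax.
KNOWN_TYPES = ("cache", "cookies", "storage")


def clear_site_data_header(types: Iterable[str] = KNOWN_TYPES) -> Dict[str, str]:
    """Build a response header asking the browser to clear the origin's state."""
    requested = list(types)
    unknown = [t for t in requested if t not in KNOWN_TYPES]
    if unknown:
        raise ValueError(f"unsupported data types: {unknown}")
    value = ", ".join(f'"{t}"' for t in requested)
    return {"Clear-Site-Data": value}


# A logout response that resets everything the origin itself stored.
headers = clear_site_data_header()
```

Permissions granted by the user are deliberately absent from the list, matching mkwst's answer that the current proposal does not clear them.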

rigo: says does not clear evercookie according to research; will provide URL

<rigo> ACTION: rigo to send URI for evercookie to mike west [recorded in http://www.w3.org/2015/10/29-privacy-minutes.html#action01]

<trackbot> Created ACTION-12 - Send uri for evercookie to mike west [on Rigo Wenning - due 2015-11-06].

mkwst: use case of this is to protect a *site*, not a user. The browser UI is for that purpose.

<rigo> http://samy.pl/evercookie/

dsinger: idea that came up about thinking of use case where website wants, say, health care data

how to share data with sites

dsinger: don't want to have individual popups (drowning in notices)

also don't want broad consent (too scary)

use digitally-signed file: list of things I need (e.g., heart rate), and nothing else; and I promise to agree to <privacy regulation X>; good for <date range>

then browser can mediate the interaction

can tell what happened to data on what site -- can have "promise violation" if site shares data beyond agreement

mkwst: initial statement was users don't read privacy policy, so...

dsinger: instead, some other group (e.g., from EFF) can list acceptable policies
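The browser-mediated check dsinger describes can be sketched as follows. This is a minimal Python sketch under stated assumptions: `DataRequest` stands in for the contents of the signed file, verification of the signature itself is omitted, and all field names, the policy identifier and the dates are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet


@dataclass(frozen=True)
class DataRequest:
    """Hypothetical contents of the signed file; signature checking is omitted."""
    origin: str
    fields: FrozenSet[str]   # the only data the site says it needs
    policy: str              # the privacy regulation the site promises to follow
    valid_from: date
    valid_until: date


def may_share(req: DataRequest, field: str, today: date,
              accepted_policies: FrozenSet[str]) -> bool:
    """Browser-side check: listed field, accepted policy, within the date range."""
    return (
        field in req.fields
        and req.policy in accepted_policies
        and req.valid_from <= today <= req.valid_until
    )


req = DataRequest(
    origin="https://clinic.example",
    fields=frozenset({"heart_rate"}),
    policy="policy-x",
    valid_from=date(2015, 11, 1),
    valid_until=date(2015, 11, 30),
)
accepted = frozenset({"policy-x"})  # e.g. a list published by a trusted group
```

With this shape the user is not prompted per request: anything outside the listed fields, the accepted policies or the date range is simply refused.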

drogersuk: we discussed this in the DAP group; don't agree, though -- user's context can change a lot during the day.

<rigo> medical data is just so sensitive and hard to deal with. It adds a huge burden to the tooling

drogersuk - do we allow user to readily revoke?

dsinger - yes. trying to strengthen agreement.

drogersuk: to me the solution may be more about negotiation process; user can have some control (granularity, etc)

dsinger - not getting into how long the consent is good for; is up to the user

npdoty: I think you are describing different ideas of "revoke".
... what if you gave away the data and then the policy changes?

<npdoty> distinction between revocation of a persisted permission on the browser vs. contacting the data controller to remove data

dsinger: gives incentive to create uniform privacy policies

drogersuk: when is this exposed to the user? When they first turn on the device?

dsinger: was not envisioning machine-readable policy as necessary

npdoty: simple version is "list of URIs"

dsinger: basically, looking at the idea of "when am I willing to share data with a particular site [under what circumstances], and what data"
... should I develop this further?

mkwst: privacy policies aside, a mechanism by which a site can make verifiable assertions seems interesting.
... Unclear if tying it to permissions is the way to go...but is basically useful idea

Other use cases? Promises.

<npdoty> warranties, or contract promises between companies

mkwst: don't want to get into linked data, but creating a mechanism for assertions is interesting.

<npdoty> [adjourned]

<npdoty> tara: thanks all for coming
