W3C

- DRAFT -

Improve definition of parties and trust relationships across W3C

27 Oct 2020

Attendees

Present
wseltzer, AramZS, jeff
Regrets
Chair
jrosewell
Scribe
wseltzer


jrosewell: Tess has prepared a paper, and Joshua has some ideas to share

<AramZS> Is the paper linkable?

<jrosewell> https://tess.oconnor.cx/2020/10/parties

<hober> https://tess.oconnor.cx/2020/10/parties

jrosewell: Tess, could you start with an introduction to your paper, then Joshua, then discussion

hober: there's been some discussion of these terms and how we use them in the standards space
... James, you suggested we needed to define "party" terms
... and I think instead, we should stop using those terms in specs, and use different terms
... this is my personal blog, thinking about how these terms are used in web standards
... how usage relates to etymology, inadequacy for spec work
... and alternatives
... Contract law: a contract has two parties, a first and second party
... a third party is one who isn't "party" to the contract, though they may be affected by it
... in colloquial usage in the browser space, browsers are rectangles you look at
... with a box at the top (location bar, omnibox)
... the terms first and second party are the site they're on and themselves
... third party is everyone else
... browser is user-agent, acting on behalf of user
... the problem of trying to bake these terms into specs is that they're handwavy and approximate
... and laden with meaning from other areas
... devil is in the details, where people make different assumptions
... spec writing tries to use terms so exactly that people can interoperate

<AramZS> already surprised to find that I had a different definition of 2nd party

hober: these terms get used in other realms
... law, policy
... fine for casual use
... in specs, we need something connected to how the tech works
... and tech uses origins and sites
... an origin is a scheme:host:port tuple
... things relating to security on the web need to relate to origins
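
Illustration, not part of the discussion: a minimal Python sketch of the origin tuple described above. The names `Origin`, `origin_of`, `same_origin` and the `DEFAULT_PORTS` table are hypothetical, and handling only http and https default ports is a simplifying assumption.

```python
from typing import NamedTuple, Optional
from urllib.parse import urlsplit

# Default ports for the two schemes this sketch handles (an assumption;
# real user agents know about more schemes than these).
DEFAULT_PORTS = {"http": 80, "https": 443}


class Origin(NamedTuple):
    scheme: str
    host: str
    port: Optional[int]


def origin_of(url: str) -> Origin:
    """Return the (scheme, host, port) tuple for a URL."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when the URL omits one.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return Origin(scheme, host, port)


def same_origin(a: str, b: str) -> bool:
    """Two URLs are same-origin only if all three tuple members match."""
    return origin_of(a) == origin_of(b)
```

Under these assumptions, `https://example.com/a` and `https://example.com:443/b` compare as same-origin, while changing only the scheme or only the host makes the pair cross-origin.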

https://tess.oconnor.cx/2020/10/parties#origins

hober: for legacy reasons, we also have reference to site, which has some relation to registrable domain or eTLD+1
... notably, cookies
... so it's often effectively a storage boundary
... what we're talking about when we talk about parties is often the relationship of an iframe to the top frame
... eg "this iframe relative to the top frame is same-site or cross-site"
... blog post suggests we add definitions of same-site, cross-site, same-origin, cross-origin for environments
... then we can use those in developing new privacy and security features
... for consistency and completeness in specifying details
... Fundamentally, the goal of specification is interop; being exact helps us achieve interop; these definitions can help us be exact
... TLDR, I don't think we should use first-party and third-party in specs
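
Illustration, not part of the discussion: a rough Python sketch of how the iframe-to-top-frame relationship could be expressed with same-origin, same-site and cross-site instead of party terms. The `Environment` and `Relationship` types are hypothetical, the registrable domain is assumed to be supplied by the caller, and comparing sites without the scheme is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import Enum


class Relationship(Enum):
    SAME_ORIGIN = "same-origin"
    SAME_SITE = "same-site, cross-origin"
    CROSS_SITE = "cross-site"


@dataclass(frozen=True)
class Environment:
    """Hypothetical stand-in for a frame's environment."""
    scheme: str
    host: str
    port: int
    # Supplied by the caller (assumption), e.g. from a Public Suffix List lookup.
    registrable_domain: str


def relationship(top: Environment, frame: Environment) -> Relationship:
    """Classify an embedded frame relative to the top frame."""
    if (top.scheme, top.host, top.port) == (frame.scheme, frame.host, frame.port):
        return Relationship.SAME_ORIGIN
    # Sites are compared by registrable domain only in this sketch.
    if top.registrable_domain == frame.registrable_domain:
        return Relationship.SAME_SITE
    return Relationship.CROSS_SITE


top = Environment("https", "www.example.com", 443, "example.com")
login = Environment("https", "accounts.example.com", 443, "example.com")
widget = Environment("https", "ads.example.net", 443, "example.net")
print(relationship(top, login).value)
print(relationship(top, widget).value)
```

The point of the sketch is that each answer follows from fields a user agent already tracks, so two implementations given the same inputs should reach the same classification.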

jrosewell: a question, to do with public suffix
... do you mind explaining?

hober: after generic top-level domains, such as .com, .gov, .gay
... there are ccTLDs, .us, .uk
... up to national bodies to determine how country-code names are allocated
... some countries mirrored the generic hierarchy or added other levels, e.g. .com.mx
... so accounts.mycompany.com.mx, origin is accounts....
... site is mycompany.com.mx
... but you shouldn't be able to set cookies for .com.mx, every commercial site in Mexico
... so browser has an idea of registrable domain, what's the highest level domain an ordinary entity can control
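
Illustration, not part of the discussion: a small Python sketch of the registrable-domain idea using the .com.mx example above. `TOY_PUBLIC_SUFFIXES` is a hypothetical hard-coded stand-in for a real suffix list, which is far longer and also has wildcard and exception rules that this sketch ignores. All function names are assumptions.

```python
from typing import Optional

# Toy stand-in for a public suffix list (assumption, not real data).
TOY_PUBLIC_SUFFIXES = {"com", "gov", "gay", "us", "uk", "mx", "com.mx", "co.uk"}


def public_suffix(host: str) -> str:
    """Longest suffix of host found in the toy list, else the last label."""
    labels = host.lower().rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:])
        if candidate in TOY_PUBLIC_SUFFIXES:
            return candidate
    return labels[-1]


def registrable_domain(host: str) -> Optional[str]:
    """Public suffix plus one more label, or None if host is itself a suffix."""
    host = host.lower().rstrip(".")
    suffix = public_suffix(host)
    if host == suffix:
        return None
    # Strip ".<suffix>" and keep only the label directly to its left.
    prefix = host[: -(len(suffix) + 1)]
    return prefix.split(".")[-1] + "." + suffix


def may_set_cookie_for(domain: str) -> bool:
    """Refuse cookie domains that are bare public suffixes such as .com.mx."""
    return registrable_domain(domain.lstrip(".")) is not None


print(registrable_domain("accounts.mycompany.com.mx"))
print(may_set_cookie_for(".com.mx"))
```

With this toy list, accounts.mycompany.com.mx maps to the site mycompany.com.mx, and a cookie scoped to .com.mx is refused because that name is a suffix rather than something an ordinary entity controls.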

Public Suffix List

hober: it's a hard-coded list, not algorithmic, rooted in policies that change over time
... Ryan Sleevi wrote a great rant about how bad PSL is

https://github.com/sleevi/psl-problems

hober: but it exists because it has to

AramZS: very useful, because your definition of second parties isn't mine
... mine was the groups that trade user data behind the scenes

jrosewell: Joshua, your thoughts

<AramZS> Agreed!

joshua_koran: for those who haven't read hober's post, it's great
... this session was about creating trust relationships, how people can create better understanding of where they're placing trust
... review 3 models of how people interact with properties
... a browser defines a first-party as the site in the address bar
... but that doesn't work for smart speakers or mobile apps
... we'd likely agree that usage doesn't match legal definitions
... Black's Law Dictionary:
... party is a legal person named in the transaction
... third party is one not named in the doc, but may have rights
... beneficiary
... relating to contracts, contractual privity
... wikipedia on privity of contract.
... when mfr sells through distrib to consumer, there's no privity between mfr and consumer
... so online, ...
... I propose legal definitions are useful for thinking about how trust is established on the web
... Privacy law. Neither GDPR nor CCPA use "first party", rather "data controller"
... who may use "data processors" to process data
... if we look at terms origin and site, is it easier for people to understand these new terms than party or controller?
... why we need terms, people want to trust the orgs they interact with, trust they won't be harmed
... e.g. if I use airbnb, I trust place will be cleaned, without knowing who's the cleaning service
... Believe we're aiming at 3 goals:
... improve clarity for normal people
... protect people, produce audit trails
... when we make these changes, they shouldn't preference independent publishers who must rely on vendors
... as we improve people's interaction, focus on what makes it easier to understand the value exchange and let publishers choose which vendors they work with

jrosewell: you mentioned data controller and processor as tech terms

joshua_koran: GDPR "controller" is responsible for the collection and processing of data, even if they outsource the processing

<dmarti> +q

jeff: neither a spec-writer nor a lawyer
... thanks for the presentations
... now we have even more terms!
... and I'm confused
... a possible contribution of the IG could be to discuss business cases and map across language
... what it means for each of the audiences

dmarti: in California, under CCPA, there's a similar concept of service provider, roughly equivalent to GDPR processor
... how reasonable is it to assume that domain registration rights and data stewardship obligations overlap?
... e.g. if localization requires data to be segregated in-country, don't use domain to make assumption about data-sharing

hober: those are different domains
... mytrademark.com.mx and mytrademark.co.uk are different domains/sites
... First Party Sets proposal might allow someone to say those domains should be grouped; that's just a proposal at this point.

joshua_koran: I don't think it's reasonable for people to have to understand corporate ownership
... lots of complexity
... and also from business perspective, not reasonable to make them nest all their operations under a single domain if they want to operate globally

<Zakim> kleber_, you wanted to ask what limits on letting "publishers choose which vendors they work with" Josh is worried about

kleber_: to joshua_koran, a key threat you highlighted was limits on letting "publishers choose which vendors they work with"
... what limits does this imply, or are you worried about in the future?

joshua_koran: many smaller publishers must rely on many other vendors to operate their businesses
... we want to give advertisers equal choice over their vendors
... to make those networks interoperate, we need some common
... re privacy, we can distinguish between directly identifiable information and pseudonymous id to give what's needed for advertising, budgeting
... if we interfere with ad supply chain, we interfere with publisher monetization
... which in turn interferes with what's available to people

kleber_: in a world where nobody can recognize the same person across domains or sites,
... where that's not a capability of the web, I don't understand how this influences whom in a supply chain you work with
... if the privacy model of the web lacks this capability

joshua_koran: we probably do want to allow people to browse across publishers without having to disclose their identities
... but if there's a minimum currency of what publishers need to disclose to be viable
... competitive disparity between large publisher, lots of existing people and interactions; a smaller publisher or startup won't have the same scale
... how do we enable them to offer something that's comparable, to advertisers

Christine_Runnegar: thanks Tess for writing up your terminology
... I'm less interested in use cases, than in making sure user privacy gets the best possible protection on the web
... what we need to do in W3C community is use language that makes sense in tech specs, with enough precision
... problem with legal definitions is there are many different cultures and different laws
... I care about the protections we put in place, that we can be precise enough

AramZS: agree with kleber, we're not excluding small publishers if making changes across the board
... vendors have to comply with the same rules everywhere
... while I'm glad to hear legal feedback, standards-writing process needs to spread across multiple domains and jurisdictions
... beyond US, Europe
... as we are considering how we use terms, agree with joshua_koran we need to consider cases that don't have browser with URL-bar
... thanks for raising that
... applying same-site, same-origin to IOT

joshua_koran: agree we're thinking about people's privacy first and foremost
... we're not often enough thinking, e.g. that an origin can harm them equally
... most advertising today is trying to match content to people
... if a small publisher has less data, less feedback, their matching will be worse than for a large publisher
... so the value they can offer advertiser will be worse than that offered by largest publishers
... then advertisers may want to shift more of their budget to large publishers
... concerned that we're inadvertently setting up a system disadvantageous to small publishers

AramZS: are you arguing user tracking is necessary to make small publishers competitive?

joshua_koran: per-user-agent pseudonymous tracking, yes

AramZS: I disagree
... in the current setup, that's what advertising relies upon
... but if it goes away, then don't think that will continue
... small publishers existed before the Internet, contextual advertising
... I'm not sure user targeting of any sort has advantaged all small publishers, either

<jrosewell> From Zoom chat Christine : A write-up about value of contextual advertising (NL publisher) - https://brave.com/npo/

AramZS: it has led to bad decision-making, content chasing user-derived statistics
... we may see business models changing and the mechanisms by which users interact with and trust small publishers improving

joshua_koran: separate targeting from optimization

pbannist: think we're talking about 2 things: tech and specification issues around naming
... hober's names are great
... in more colloquial usage, many people think "first party good, third party bad" and that's not quite right
... when we talk about big v small publishers, it's often walled gardens vs everyone else
... bc walled gardens are vertically integrated, they can run things themselves, without 3rd parties
... so concern re gap of capabilities between small and large

hober: was going to ask joshua_koran about minimum bar of pseudonymous identifier. that's below the minimum bar for privacy on the web

kleber_: the whole point of Google privacy sandbox efforts and new APIs Chrome has proposed is exactly to meet the closing the loop and understanding advertising and user targeting goals without requiring pseudonymous identifier

<AramZS> Perhaps if we're going to discuss this we should define publishers in the future, because I'm not sure everyone would agree with you Paul, though I see your point.

kleber_: or per-user tracking

jrosewell: thank you all very much!

<AramZS> Thank you to all yes!

<kleber_> Thanks @wseltzer

Summary of Action Items

Summary of Resolutions

[End of minutes]

Minutes manually created (not a transcript), formatted by David Booth's scribe.perl version (CVS log)
$Date: 2020/10/27 15:01:09 $
