Improving Web Advertising BG -- 21 Jul 2020

<scribe> scribe: Karen

Agenda-curation, introductions

Wendy: Let's get started with agenda curation and introductions
... we have continuation of the TURTLEDOVE/SPARROW proposals
... we will continue with the Magnite presentation
... and invite people to pick up items from the dashboard, the various proposals we have seen
... and issues in the repositories that would benefit from discussion here
... and any other business
... See if anyone wants to report on the SPARROW reporting from last Thursday
... and any other business that people would like to suggest for the agenda?
... Any introductions; new participants?

Daniel

David Dabbs, Epsilon/Conversant

@ from Booking.com

Mike Lerra, Google

@ Facebook

<kleber> Welcome back ajknox!

Clearcourt

2 TURTLEDOVE/SPARROW proposals: Gatekeeper and Proprietary

Wendy: Last time we heard proposals around Gatekeeper and proprietary cohorts from the team at Magnite
... We did not get to questions or conversations

Tom: should we do a brief summary of the proposal?

Wendy: yes, give a high-level pointer

Tom: two proposals

<wseltzer> https://github.com/MagniteEngineering/ProprietaryCohorts/

Tom: one is proprietary cohorts; browser shares info with any entity to create cohorts

<wseltzer> https://github.com/MagniteEngineering/Gatekeeper

Tom: an identifier is passed in the proprietary cohorts case
... any entity can provide cohort assembly
... second is cohort gatekeeper
... deconstructs the browser functions
... interbrowser comms can take place in a secure way
... those are the two proposals
... provide a framework for TD and SPARROW proposals
... for cohort creation
... pause there
... I think there are questions on those
... thanks

Michael Kleber: Happy to let others talk

Wendell: thanks for the proposal about the refactoring of our industry
... I think it is good to have these discussions and flesh out the discussions
... I have a question
... I may have the answer myself
... Name of proposal is Gatekeeper
... Tom, I think you mentioned an ID that would be passed from browser to the gatekeeper
... could you say more about the intent of the gatekeeper and the ID
... "the" there is only one
... gatekeeper is a giant database of people
... and ID is a unique ID
... if there is more there, or I missed something
... and how does it operate?
... a church, a non-profit, a for-profit, or like US govt

Tom: Starting point is between browsers
... a very precise definition is needed of how browsers will create inter-browser data
... less dependency on the browsers for cohort assembly
... we define how to communicate and externalize how to @
... is it gov't, church, bar
... it has to be fleshed out and agreed

Tom: many examples of governance
... in order to pass an ID there must be some oversight structure
... an area that needs to be defined
... not just any entity
... not any entity can say it's a gatekeeper
... governance inside protocol
... meant to be something with a governance structure
... controlled by all the participants in this space

Wendell: there are a few ways to do that
... gov't
... movie industry does this
... have ideas about playlists
... that are controlled by their IP
... how does that control work?
... police force; patents and IP, take a license to be able to use this thing

Tom: We are trying to not dictate that; many ways it can be done
... many cases where self-governance works fine
... requiring that it only be in a controlled environment would hinder innovation
... publish the interface; replicate and develop a governance structure

Valentino: this is similar to the question just asked
... I can change it
... it is valid for SPARROW proposal
... we should start to talk about who operates all these non-profits or third parties
... who do multi-party aggregation service
... who runs this gatekeeper
... everyone here is familiar with the scale of the advertising business
... to run this with 5K companies seems expensive and hard to store all that data

Tom: yes, that needs to be discussed
... there is a gatekeeper in TURTLEDOVE
... the function will be performed; can it be decoupled
... who does it
... why proprietary cohort has advantages
... you have a temporary identifier path
... needs to be governed by the browser protocol

Michael: the nature of the gatekeeper is worth talking about
... gatekeeper name is borrowed from SPARROW
... has some server side component, trusted to work in a certain way
... I think calling this a gatekeeper also
... hides the fact that proposed gatekeeper is very different from other server side
... gatekeeper that @ proposed
... did not keep records about individual people
... and gatekeeper aggregated stuff in both Chrome and SPARROW
... proposal, collect aggregate data over time
... this gatekeeper is to build up profile of individuals over time
... that is a much more invasive job
... as part of an effort to increase privacy
... This one sticks out as much less private
... in terms of kind of data and nature of what is being collected over time
... more than what we have talked about

Tom: That is not the intention
... only store minimal info necessary to create cohorts
... have gatekeeper have same id as browser
... browser sharing info; that same info should be available to gatekeeper
... by no means necessary to store long-term data

Michael: browser knows all my browser history
... and any server would learn way more than I think it should

Tom: We know it's self contained in browser
... also know there is peer to peer capability
... also comms outlined in TD proposal
... get clarity on what info is shared
... if not a master ID file for world
... governing entities
... not the intention

<btsavage> I'm confused... what "peer to peer" protocol is being referenced?

Tom: provide some level of cohort creation outside of browser
... team may have to do more to be consistent with privacy proposals

charlieharrison: first point
... complexity and role of server side infrastructure
... some concerns about it storing stuff
... this question really informed our design of the aggregated measurement
... aggregation is stateless
... all the adtech state is held on adtech servers
... the helper servers are not collecting data from browsers
... you are getting reports only readable by aggregation service
... and not tolerate high QPS
... batch inquiries in a stateless way
... technical designs to make running these servers easier
... talking about the aggregation doc
... may be dealing with complex queries
... might be solutions here
... Other point to make
... in terms of trust and the helper servers or gatekeepers
... the technique we used is to distribute trust across multiple parties
... only need to trust one of the parties across the gatekeeper system
... get pretty good guarantees
... about privacy
... we pointed out some of these problems; might be applicable to a SPARROW like model
... happy to brainstorm these ideas more
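
[Illustration of the "distribute trust across multiple parties" point above. This is a minimal sketch using additive secret sharing, which is one common way to get that property; the minutes do not say which technique the aggregation design actually uses. The modulus, the two-helper setup, the function names and the sample reports are all assumptions made for the example.]

```python
import secrets

# Hypothetical modulus, chosen only for this sketch.
MODULUS = 2**61 - 1


def split_into_shares(value, num_helpers=2):
    """Split a value into additive shares, one per helper server.

    Any subset smaller than the full set of shares is uniformly random,
    so privacy holds as long as at least one helper does not collude.
    """
    shares = [secrets.randbelow(MODULUS) for _ in range(num_helpers - 1)]
    last = (value - sum(shares)) % MODULUS
    return shares + [last]


def helper_total(shares_seen):
    """Each helper sums only the shares it received, never a raw report."""
    return sum(shares_seen) % MODULUS


def combine(helper_totals):
    """Adding the per-helper totals reveals only the aggregate."""
    return sum(helper_totals) % MODULUS


# Made-up per-browser reports (for example, 1 = converted, 0 = did not).
reports = [1, 0, 1, 1, 0]

per_helper = [[], []]
for report in reports:
    for inbox, share in zip(per_helper, split_into_shares(report)):
        inbox.append(share)

totals = [helper_total(inbox) for inbox in per_helper]
aggregate = combine(totals)
print(aggregate)
```

[In this sketch neither helper sees an individual report; only the combined total is meaningful. The helpers also keep no per-browser state between batches, in line with the "stateless" remark above.]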

James: a couple quick questions
... do any of proposals, TD or SPARROW, others, allow access to the browser history?

@: question for Michael and TURTLEDOVE
... I think it's yes for purposes of cohorts

James: so question is which parties have access to browser history
... a narrower or wider group of people
... make sure that is clear for the record
... also discussion of privacy principles
... I am not aware that we have documented and agreed to those
... if there is another document, could you post into irc
... I would be keen to see that

@: Sharif, could you please mute

btsavage: there is a reference to peer to peer protocols

btsavage: I don't understand what you are referencing in TD

Tom: My assumption is interbrowser comms

@: web site asks my browser to store

<wbaker> I would have said the Privacy Principles were from https://github.com/michaelkleber/privacy-model, which we had from the 2019-09 f2f meeting in Cambridge MA

@: info about my belonging to a cohort or interest group
... then my browser makes requests
... member of this browser and cohort
... why do you make this assumption?

Tom: I don't think we have reached consensus
... cohort assumption

<joshua_koran> How can a cohort assembly process ensure sufficient number of browsers in each one without such sharing?

Tom: has to be comms horizontally between browser or vertically with servers

Michael: I agree
... I don't understand
... two different proposals from Chrome
... one is about FLOCS, one is about IGs
... servers, show ads about something in future
... there is no comms between browsers, no peer to peer
... just a server sees you on a page about whatever, and show you ads about whatever in future
... direct comms between browser and web page; script running on web page
... only way there might be a server infrastructure
... is if we need to make sure that the group of people interested, that the IG, is above some threshold size to avoid individual targeting
... building of groups does not involve what you are talking about
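
[Illustration of the threshold check mentioned above. This is a minimal sketch, assuming a server that only knows how many browsers are in each interest group; the threshold value, group names and counts are invented for the example and do not come from the TURTLEDOVE proposal.]

```python
# Hypothetical minimum audience size; the proposals do not fix a number here.
MIN_GROUP_SIZE = 100


def targetable_groups(membership_counts, min_size=MIN_GROUP_SIZE):
    """Return only the interest groups large enough to be advertised to.

    The input maps a group name to a member count; no per-user data
    is needed to apply the threshold.
    """
    return {group for group, count in membership_counts.items() if count >= min_size}


# Made-up counts for two interest groups.
counts = {"running-shoes": 5400, "rare-hobby": 3}
eligible = targetable_groups(counts)
print(eligible)
```

[Here only "running-shoes" passes the check; "rare-hobby" is too small to target without singling out individuals.]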

Tom: that needs to be more explicitly stated
... there are some use cases where it seems to be implied
... we are conflating the FLOC work with @ work
... if that is incorrect, that should be stated

Michael: Two proposals are separate
... if I can make it clearer that TD
... that there are no identifiers
... a lot of work that there is no way to tie it to an individual

Tom: That is understood
... the proposals are independent
... but both are about how cohorts are assembled on the web
... would be consistent with each other
... that is an assumption

Michael: FLOC is a different beast
... different assumptions about how cohorts are put together
... better to be about how cohorts are put together
... FLOC that you recognize being part of by the browser
... that is a piece of info shared by you and lots of other people
... browser puts a lot of effort into not having specific info about you
... but info joined up with contextual info about what page you are on
... ad request for URL
... of page you are on
... and ad request
... and also says what FLOC request
... and with others with similar browsing history
... TD is extremely different
... completely walled off browser history
... no way to join with any other notion of who you are
... does not get joined up with first-party identifier or the page you are on
... the trade-off
... the ad serving and rendering and reporting is much more complicated
... by keeping it separate
... why TD is such a complicated proposal
... a necessary consequence of this wall between two types of info
... more freedom of how we put people into an IG
... does not leak out into rest of their browsing activity
... these proposals talk about like a FLOC
... joins up with other info about you
... from privacy perspective, seems more dangerous
... TD, all you can do is advertise to it; would be a much safer one

Wendy: from that back and forth; more clarifications
... maybe there is opportunity to put clarifying text; pull requests welcome

Gang: You explained gatekeeper....browsing history
... and gatekeeper decides who forms a cohort
... how long does the gatekeeper need to save the browsing history?

Tom: Assumption is hours but that has to be defined

Gang: hours, I see

James: Fascinating that we now have more proposals coming forward; TURTLEDOVE, FLOC, SPARROW
... if these all go forward to prototyping
... thank you to Michael for the link in irc
... good to take that forward with success criteria doc

<kleber> Hah, thanks Wendall for posting my link, I was too busy talking :-)

James: we need a method of agreeing what the privacy proposals are; otherwise we create a large amount of work
... would like to put success criteria on next week's agenda
... and help to address these issues

Ben: So to the point made earlier
... the Chrome team's design

<btsavage> https://www.google.com/chrome/privacy/whitepaper.html

Ben: that I think Charlie made about how to minimize trust, what is leaking

<btsavage> "If “Include history from Chrome and other apps in your Web & App Activity” is checked on the Web & App Activity controls page, Google also uses your synchronized browsing data to provide personalized Google products and services to you."

Ben: link in irc for Google Chrome privacy white paper

[reads quote]

<kleber> @jrosewell Note that other browsers have published privacy policies also: Mozilla: https://wiki.mozilla.org/Security/Anti_tracking_policy WebKit: https://webkit.org/tracking-prevention-policy/

Ben: Earlier on that same page

<btsavage> "When you’re signed-in and have enabled sync with your Google Account, your personal browsing data information is saved in your Google Account so you may access it when you sign in and sync to Chrome on other computers and devices. Synced data can include bookmarks, saved passwords, open tabs, browsing history, extensions, addresses, phone numbers, payment methods, and more. In advanced sync settings, you can choose which types of data to [CUT]

Ben: it says what types of things are synched

[reads quote]

Ben: So it sounds like it is not just the browser that has access to the full browsing history
... it is also synched to Google servers
... and can provide services; but does not say which services
... is that in conflict with privacy principles you outlined earlier?

Wendy: if anyone wants to answer Ben

Kris: I was going to say

<jrosewell> @kleber they have. I do not believe these have been adopted by the W3C.

Kris: obviously the browsers can make these joins between browsers and ads they are seeing
... all that data is in browser because it has to be
... I get that if more entities, that is problematic
... but there is an assumption that browsers are automatically trustworthy
... I would feel better if there was more transparency or an audit about what is happening
... where it could be pinged
... why a user gets a particular ad, or why being placed in a FLOC
... what appeals to me about gatekeeper is an org that could be audited
... browsers do have this type of thing
... share more explicitly
... or some way that it could be audited
... or this idea that no one can be trusted

Tom: That was the intent of the gatekeeper proposal

<joshua_koran> +1 to Kris' comment that if browsers can "appropriately" process data, we should abstract this to other entities that can also do such processing

Tom: that browser data should be transparent and published
... anything in browser should be democratized for audit and control
... that is exactly the intent

Wendy: Thanks
... I see queue is empty
... Lively discussion
... added the gatekeeper and proprietary cohorts to list of documents that populate the dashboard
... so issues raised there can also appear here

<wseltzer> https://w3c.github.io/web-advertising/dashboard/

Wendy: in the dashboard
... like people to scan through that
... and I see we have a large selection of proposals that have been discussed
... here and lots of active discussion on issues

@: Does anyone want to comment from Chrome team on my question of usage synch?

@@: this is a fairly existential question; cannot discuss today
... it is critical to the decision analysis

Kleber: What I can say
... go to Chrome: settings
... tells you what info is stored
... if you don't turn on synch, nothing leaves your device

<alextcone> +q

Kleber: if you turn on synch then info is saved on Google servers
... synch and Google services gives you a whole menu of ways the synched data might be used
... you have control over which ones you want Chrome to do
... and it ties into all activity
... and see everything associated to your Google account
... there are a lot of settings there that let people control this
... Ben, does that answer your question?

Ben: not entirely
... I am aware of those controls
... surprised when I logged in
... that I was signed into Chrome
... I do not remember signing into Chrome
... If I sign into gmail, that automatically logs me into Chrome
... surprised that I was logged into Chrome
... Two approaches
... adtech industry collected info and by default this was turned on
... if people can selectively opt out
... new Chrome proposals are in a different vein
... rather than default opt out
... data is remaining anonymous
... storage access API
... perhaps a choice to opt-in
... but by default when you log in to store all information in case of opt-out

Wendy: We have some design choices and how they interact with proposals and potential standards

<jrosewell> https://www.gov.uk/cma-cases/online-platforms-and-digital-advertising-market-study

James: There is a report
... that talks a lot about these issues; we have talked about it previously
... I am happy to facilitate a separate meeting
... I think these issues are absolutely germane to discussion of past 45 minutes

Wendy: Some of technical issues in that report
... as individual issues or comments on proposals, leading to new proposals seems potentially useful

bleparmentier: Want to say I am surprised by what Ben said
... we are all working with a lot of assumptions
... that user consent should not be global
... limited to targeted people
... hearing that Google will keep
... it
... I am extremely surprised
... I have been logged into Chrome without any ask
... not an issue for me
... some good things from logging into Chrome, but I was not asked about targeting
... and no idea that Chrome will keep individual targeting capabilities
... I am surprised that this is ok
... way worse than anything done on ad side
... page where you can select opt-out
... we are told this is not ok
... Really surprised
... if Google is only one to have targeting at user level
... everyone else will need to rely on it
... doing this all together
... same level playing field
... if on top of that, on open Internet
... if Google is only one to do cross-website
... that is an issue

Wendy: This may be
... a different issue from where we started on the agenda
... we don't have anyone prepared to address this from the Google Chrome side
... Suggest we bring that topic back in a different meeting
... regarding how signin to Chrome works
... anyone free to raise as another agenda

<krischapman> I don't think the Chrome folks are suggesting that Google gets to keep identifying users, and everyone else doesn't

Alex from TechLab

Alex: Wanted to support what Ben was saying
... Opt-out has been thrown out a lot
... two core principles
... settings in Chrome; transparency and control
... I want to highlight the dichotomy
... of the principles behind what is being proposed in privacy sandbox
... doesn't have transparency or control
... unless we get an answer, seems like it will continue
... as something Google shares back
... whether opt-in or opt-out
... we are talking about control based on transparency

Kanishk: I think Alex covered my points
... If Chrome team is not ready to answer
... I would like to have this as a topic
... critical conversations; more fundamental in my mind
... want to continue this conversation

<jrosewell> alextcone: I agree control is very important, and needs to be resolved before any of these proposals are advanced.

Wendy: Anyone should send as an agenda request
... or add issues into the GitHub repository so that we can get the right people to discuss it
... thanks

Wendell: this is an interesting discussion
... like to remind myself and the group
... there are other W3C groups which have these sorts of scopes
... the Privacy CG meets Thursdays and has representation from other browsers
... only two and half browsers who show up
... go to where the action is
... this is a business group, and important to talk about issues here
... but there is the CG
... and some of us also sit on the Privacy Interest Group
... where we do horizontal review of web features
... and give formalized responses
... we are working on the privacy threat model
... James, you were looking for formalized principles

<wseltzer> https://w3cping.github.io/privacy-threat-model/

Wendell: that document has potential to become what you are kind of looking at
... wrap up with that
... these ideas are good; but this group is not the only place where these discussions are taking place
... encourage you to participate in these other groups

Wendy: Thanks for the cross-group advertising

Brad: Mostly getting to Ben's question about signing into Chrome or gmail
... trying to find the right reference material

<blassey> https://www.blog.google/products/chrome/product-updates-based-your-feedback/

Brad: will drop blog post that explains this
... Ben, you are referring to project launched last Fall
... people on shared devices were getting into Frankenstates
... parents, child mixing accounts
... on shared devices
... it is a feature to provide consistency across that
... what it does not do is turn on synch
... if you sign into gmail
... it will make sure the signed in user for Chrome is consistent with gmail
... but does not do the synch
... data from one account does not bleed into another account accidentally
... blog post explains it a lot better than I do

Wendy: Thank you

James: thank you
... responding to Wendell's point
... thank you for sending the privacy model from PING
... will incorporate into the success criteria
... important to gain that consensus
... huge number of groups across the W3C
... it's not reasonable for all of us to contribute to all these groups
... with that in mind, this group was specifically formed to contribute to privacy sandbox

<alextcone> Re: Brad L's point - Note again that default opt-in vs default opt-out is not the question. The issue is that those settings allow for transparency and control. Privacy Sandbox does not have the concepts of transparency and control. In fact it is a response in part to say that transparency and control are not possible.

James: so therefore onus is on Google to come to this group
... because that is what was specifically requested back in January

Bleparmentier: Across multiple browsers, or across multiple browsers?
... in two years, no third party cookies
... can Google do cross-site advertising
... whereas we will be working with...
... no privacy available
... are you going to do @
... while all of us work with cohort
... or is it not the case
... there may be a big issue here

Brad: to that point, Google will use same capabilities for targeting across third parties
... that is the goal here
... I got on the queue to make a different point
... specifically to the success criteria doc
... James, you are talking about consensus
... as a business group there is no consensus mechanism for a document; would have to happen in another group

Wendy: do you have another response, Michael?

Michael: after third party cookies are eliminated
... everybody will be able to target you the same way
... everyone will use the same APIs

Brad: Does that answer the question?

B: change the condition
... it is not written
... you are giving us guarantee you are not requiring @

Brad: we will have to update the privacy white paper to reflect that

Wendy: if there is more specific information people want to raise for the agenda
... I heard the success criteria
... some commentary and issues on that

Marshall: just want to reinforce what Brad said
... absolute intention is a level playing field
... find a path to that
... be clear it will be for a level playing field

Wendy: thank you

<wseltzer> [adjourned]

Wendy: in the mean time, raise issues, discussions, and see you next week
