W3C

– DRAFT –
Improving Web Advertising BG

13 July 2021

Attendees

Present
apireno_groupm, aramzs, arnaud_blanchard, bLeparmentier, bmay, Brendan_IAB_eyeo, btsavage, charlieharrison, dialtone, ErikAnderson, eriktaubeneck, GarrettJohnson, hober, imeyers, jdelhommeau, jrosewell, Karen, kris_chapman, lbasdevant, mallory, mjv, mserrate, nics, nlesko, seanbedford, wbaker
Regrets
-
Chair
Wendy Seltzer
Scribe
Karen, Karen Myers

Meeting minutes

Wendy: audio working?

[Yes]

Wendy: Looking at the agenda...

Agenda-curation, introductions

Wendy: Agenda curation and introductions; followed by
… a presentation on the shared storage API
… Josh Karlin will speak
… other questions asked have been addressed at the FLEDGE WICG meeting
… and the minutes of that meeting were linked for people's attention
… and follow-up questions to that discussion will probably continue in the WICG-TurtleDove threads
… Then dashboard highlights, agenda+ and AOB
… anything to add to agenda or a future meeting?
… Any introductions? Anyone new to the group who would like to introduce themselves?

Shared Storage API Draft [Josh Karlin] https://github.com/pythagoraskitty/shared-storage

Wendy: Ok, let's move to Josh
… are you ready? Thanks for joining us

Josh Karlin, Google

Josh: Thanks for inviting me; the API is new
… give a brief overview of what its intent is

<wseltzer> https://github.com/pythagoraskitty/shared-storage

Josh: ask for some feedback
… we plan to get it into a Community Group
… and have regular calls about it
… As you are aware, Chrome and other browsers are attempting to do partitioning at site level; to prevent cross-site tracking
… privacy documentation
… benefits to prevent cross-site tracking
… but also impacts third-party widgets and personalized advertising
… these APIs have to poke the smallest holes
… cross-site data does eventually leak, but do as little as possible
… poking holes, but minimize
… one API is shared storage
… examples, cross-site A/B for lift and reach measurement
… Let me talk about what it is
… In this world of everything being partitioned, it is one thing that is not
… [missed]
… something must be different
… You can write to shared storage from anywhere
… but only read from it
… Like JS environment
… a pure functionalist
… call or read all you want on shared storage
… assimilate data across sites
… can read it, but cannot tell anyone
… what are various ways to take useful information about users
… and do it in a way that is privacy safe
… two gates we are proposing
… Simplest is to think about an aggregate
… aggregate metrics
… here is some shared info
… info about user
… send an aggregate report to server
… and say, give me aggregate, like a histogram of what they look like
… an aggregate of all the users
… you can imagine how this can be used for reach
… Each time there is an impression I can write to shared storage
… once a week...find out how many times a user saw a certain set of ads
… and write report about how many times user saw the ad
… then the adtech can reach out to ad reporting server and find out how many times ad was seen in total
… Other things you can do with this
… a matter of performing an aggregate report
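
[Illustration, not presented in the meeting: a minimal Python sketch of the reach flow described above, under stated assumptions. `SharedStorageModel`, `record_impression`, `weekly_report` and `aggregate` are hypothetical names, not the proposed API, and the sketch omits any privacy restrictions on the aggregate.]

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


class SharedStorageModel:
    """Toy stand-in for one origin's unpartitioned storage in one browser."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def append(self, key: str, value: str) -> None:
        # Writing is allowed from any site; the caller never sees the old value.
        self._data[key] = self._data.get(key, "") + value

    def read_inside_worklet(self, key: str) -> str:
        # Assumption: only code standing in for the worklet calls this.
        return self._data.get(key, "")


def record_impression(storage: SharedStorageModel, campaign: str) -> None:
    """Called on each impression, on whatever site the ad appeared."""
    storage.append(f"impressions:{campaign}", "x")


def weekly_report(storage: SharedStorageModel, campaign: str) -> int:
    """Worklet-side step: how many times this user saw the campaign."""
    return len(storage.read_inside_worklet(f"impressions:{campaign}"))


def aggregate(reports: Iterable[int]) -> Tuple[Counter, int]:
    """Server-side step: histogram of per-user counts and total impressions."""
    counts = list(reports)
    return Counter(counts), sum(counts)
```

In this model the reporting server only ever receives the combined histogram and total, never an individual browser's stored string.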

<MpalmerGroupM> Is Josh sharing slides via a different channel from the webex?

Josh: with enough reports and details, even in aggregate you can learn things about people

<wseltzer> [no slides]

Josh: If you have questions on aggregate restrictions, I can put you in touch with my team
… Lift report
… Open up shared storage

<MpalmerGroupM> thanks wendy

Josh: and have it return one of five
… list of URLs
… going to provide shared storage, and have it display in fenced frame

<wseltzer> [Shared Storage explainer]

Josh: work it's doing is choosing a URL to display
… not be a leak of info from shared storage to worklet
… worklet returns an opaque URL
… if this is an A/B experiment
… and this user, using simple math, user should not see ad in question, so give them A instead of B ad
… this would work cross-site
… works like a generic filter
… based on what you know about cross-site info, based on info
… Cannot give it 100 ads or the world of ads
… but narrow to 2-5 ads
… use it for that
… that is pretty much it in a nutshell
… It is unpartitioned storage, you can write to from anywhere
… slow leakage
… get to it via URL reporting or from fenced frame
… so that's it
… Any questions?
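
[Illustration, not presented in the meeting: a minimal Python sketch of the URL selection step described above, assuming a per-user value kept in shared storage. The names `seed`, `choose_index`, `select_url` and `OpaqueURL` are hypothetical and are not the proposed API; the hash-based split is one possible "simple math", chosen only for the example.]

```python
import hashlib
from typing import Sequence

MAX_URLS = 5  # the talk mentions narrowing to 2-5 candidate ads


class OpaqueURL:
    """Handle standing in for what the embedder gets back.

    Python cannot enforce opacity; hiding the URL in repr only illustrates
    that the embedder is not meant to learn which candidate was chosen.
    """

    def __init__(self, url: str) -> None:
        self._url = url

    def __repr__(self) -> str:
        return "<opaque url>"


def choose_index(seed: str, experiment: str, num_urls: int) -> int:
    """Worklet-side step: deterministically map a user to one candidate."""
    if not 1 <= num_urls <= MAX_URLS:
        raise ValueError(f"expected 1 to {MAX_URLS} candidate URLs")
    digest = hashlib.sha256(f"{experiment}:{seed}".encode("utf-8")).digest()
    # The same seed gives the same group on every site the user visits.
    return int.from_bytes(digest[:8], "big") % num_urls


def select_url(seed: str, experiment: str, urls: Sequence[str]) -> OpaqueURL:
    """Return the chosen candidate wrapped so the caller cannot inspect it."""
    return OpaqueURL(urls[choose_index(seed, experiment, len(urls))])
```

Because the choice depends only on the stored value and the experiment name, the same user lands in the same group on every embedding site, which is the cross-site consistency the A/B case relies on.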

Bmay: A couple questions
… can do all at once or one at a time

Josh: one at a time

Bmay: can you say what an append call does?

Josh: concatenates to a string

Bmay: Are keys ordered?

Josh: I don't know

Bmay: since API returns an empty string; doesn't make sense for key to have no values
… should there be a call to set with no value

Josh: yes, something to discuss; good for GitHub repo

Bmay: Failure responses it provides?

Josh: tricky; when creating worklets
… if unable to load it
… site needs to/wants to know if worklet got created
… want to provide that kind of response but it also leaks info
… script would abort if information in shared storage
… some user behavior; could abort
… tricky thing to decide what type of error reporting to provide
… Not set in stone yet, but is addressed in the explainer

Bmay: ok, I'll take a look

csharrison: within the shared storage worklet
… you could have try-catching for errors
… and on your script
… could be possible if we can't just return errors back to JS

Josh: reading and writing to storage
… reads could report errors to the worklet itself

Wendy: thanks

Bleparmentier: basic question
… when we want to do A/B testing cross-site
… what we are supposed to do is put some kind of user ID in URL....and then be able to use it
… and then do something else to add user to a population? How do we initiate?

Josh: imagine giving user a seed
… conditional list
… give user seed and use to calculate what A/B to give

Bleparmentier: in FLEDGE in fenced frame
… how can you make the two of them interact
… if in FF
… what can you use or not
… for instance, you might want to put a bid to be zero
… is this kind of thing possible within the fenced frame

Josh: you're asking if we can read storage?
… this is under discussion
… should be topic on Github repo
… I don't know if we are comfortable with storage to be read directly
… not a problem in the FF
… and aggregate reporting
… but if want to read it, not sure

Bleparmentier: the example for A/B test, want to read it; not bid on user
… I need some kind of interaction to put it to zero

Josh: That is super good feedback

Bleparmentier: I have other questions and will put on GitHub

GarrettJohnson: Really glad to hear that Google is starting to support lift experiments in privacy sandbox
… It's really important
… if you are thinking about a hold-out experiment, an A/B test
… so using a PSA in control group; or would this handle ghost bids or ghost ads?

Josh: you are beyond my knowledge

charlieharrison: if you are in counter-factual group, you show replacement ad
… should not negatively impact
… replacement ad needs to log twice
… its own measurement and would have been other ad if user was in A group
… I think system should be able to handle cases like that

Josh: [missed]

Garrett: happy to hear that
… you need parallel reporting systems, one for ads and one for ghost ads
… Quick question on randomization at user level
… can you do impression level randomization?

Josh: can you explain an example

Garrett: impression level are used to explain return to frequencies
… so evaluate my spend
… if eligible ad is ready to be served, so flip coin for @ or control ad

Josh: seems like this could be a local decision

Garrett: not sure I understand
… saying that purpose of user design is to be consistent across web sites
… but not...
… how to connect to measurement side
… Thank you for working on these things that I and other advertisers think are important

Ben: I think that this sounds like it would fully support two of use cases
… seems to encapsulate those

<wseltzer> use cases

Ben: on lift case, sounds similar to cross multiple domain lift proposal that FB proposed a year and a half ago
… shared storage worklet simply decides which ad to show
… then rendered in FF
… so you don't know which one was chosen

Josh: correct

Ben: mechanism to select small number of ads is exclusion targeting
… if you have already bought something or installed a mobile app, you don't want to keep seeing same ads
… by checking some shared storage
… has this browser bought this product, so please don't show this ad, show another one
… shared storage, check and if product bought, return default
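
[Illustration, not presented in the meeting: a minimal Python sketch of the exclusion check Ben describes, assuming the purchase was recorded in shared storage earlier. The key name, the `"purchased"` marker and `choose_ad` are hypothetical, and a plain dict stands in for the storage.]

```python
from typing import Mapping


def choose_ad(storage: Mapping[str, str], product_key: str,
              product_ad: str, default_ad: str) -> str:
    """Worklet-side step: fall back to the default ad after a purchase."""
    if storage.get(product_key) == "purchased":
        return default_ad
    return product_ad
```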

Josh: yes, you could do that; cross-site info

Ben: you mentioned limitation on number of ads?
… where is number coming from and what are constraints?

Josh: Fenced frames are not a panacea
… linked to destination
… if we allow five ads...have [missed]
… and leaked to destination of ad
… each time user clicks on ad
… those log five bits get leaked

<MpalmerGroupM> The ability to check if a product or service has already been purchased would be very well received by advertisers

<GarrettJohnson> I thought that FLEDGE interest groups could have the property that a user is dropped once they visit a URL i.e. shop.com/purchase_confirmation . Am I wrong?

Josh: if hundreds or thousands; defining number is important
… we will push to make it as small as possible

Charlie: another important thing is protecting against micro targeting
… that's cross-site; allows microtargeting campaigns
… we may want to avoid; part of design decisions of FLEDGE

Josh:

Ben: Log 5?

Josh: log base 2
… how many bits could be transferred
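
[Illustration, not presented in the meeting: the "log 5" figure can be read as the upper bound on information carried by one selection among five URLs, log2(5), about 2.32 bits. A small Python sketch, with `max_bits_per_selection` a hypothetical helper name:]

```python
import math


def max_bits_per_selection(num_urls: int) -> float:
    """Upper bound, in bits, on what one choice among num_urls can reveal."""
    if num_urls < 1:
        raise ValueError("need at least one candidate URL")
    return math.log2(num_urls)
```

By the same bound, a choice among 1000 URLs could carry close to 10 bits, which is why the size of the candidate list matters.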

Ben: clicks connect to a destination; what does site learn?

Josh: learns URL of FF, and what other input
… URL, size of FF
… and that's it
… URL is one of possible 5
… could customize and make it
… if attacker
… customize only to come from this site and go to this destination
… so I know user got this particular one of the five

<aramzs> @GarrettJohnson - I don't think that is in the current version of FLEDGE? But one of the proposed TURTLEDOVE extensions?

Ben: why does destination learn URL of FF?

Josh: FF can represent whatever data the embedder wants
… Embedder can choose Five URLs and can be custom

Ben: so pick whatever path...destination

Bleparmentier: Five URLs represent a lot of advertisers
… we have 20K advertisers
… within one shared storage
… we have several advertisers

<GarrettJohnson> @aramzs. Thanks. Exclusion targeting seems super valuable to both advertisers and users.

Bleparmentier: understand you want to limit entropy

Josh: shared storage is per origin
… origins collaborate and could use shared storage
… I have not thought about how multiple networks would bid
… and the one that wins
… then call shared storage to do filtering
… a possible path
… if you want to call a page and shared storage, sounds like a single adtech
… have not thought about multiple ad techs
… would do via filtering

Bleparmentier: advertising for Walmart, Target, Amazon
… Criteo would....

Josh: you have to choose which of those five
… shared storage cannot be used to choose among 1000; have to choose up front and do that last pass

Bleparmentier: this is an issue I will raise
… I understand your concerns, but an important issue for me

Josh: please open...don't know that this disadvantages a small advertiser
… thinking more about A/B...ghost...size of advertiser is not important

Ben: last question; I think it functions as a filter
… you select a small number of ads; confused on timing
… when are you writing that to shared storage and when is it running
… seems to happen quickly

Josh: write to shared storage...

[too fast]
… a dictionary along with request
… chooses one right then, gives back to embedder
… URL makes no sense to user

Bleparmentier: we are giving to @

Josh: yes, it chooses one
… I should have walked through that flow
… my apologies

James: thank you, Josh
… step back a little
… what we are talking about is removing certain primitives from design of web
… and conversation highlights set of techs to deal with set of use cases
… this increases the complexity of the web
… when you presented this to Privacy CG
… you used words that caught my attention
… shared storage...you used "parties that agree not to do scary things"
… that might create a new primitive, and reduces complexity
… is that something you have considered more

Josh: That was a fenced frame discussion...for video ads
… have video bundles
… but how do you bundle video
… large objects, hard to upload, delay could be critical

<bLeparmentier> We could call it a gatekeeper!

Josh: in that scenario, maybe we have a set of trusted set of networks
… like a CDN that caches...
… but still serves off network instead of web bundle
… we want to use browser to technically enforce these partitions, using policy as little as possible

<Zakim> wseltzer, you wanted to discuss what's next

Josh: what you are talking about is being discussed, but trying to solve with tech

Wendy: I queued up to ask where you see this going
… at the very early proposal stage
… you mentioned taking to a CG next
… where would you like comments and feedback?

Josh: unclear what CG to go to, maybe WICG, Privacy CG
… right now comments in Github that exists; not too early for feedback

<wseltzer> https://github.com/pythagoraskitty/shared-storage

Josh: that is most important early feedback; please chime in
… know that we will get it there as soon as we can

Wendy: Thanks, add that pointer to our list of proposals
… thank you

Mehul: thank you, Josh, for presenting the idea
… the aspect is @ A/B test experiment...overlapping
… some time, then decision logic becomes @
… returning those ads becomes combinatorial
… so it's an expense from serving perspective
… user times number of experiments running...and payload

Josh: Are you saying you might have multiple experiments going on at once
… but cannot display ad to user...so backup needs a backup

Mehul: experimentation
… running IG one
… select whether technique X ... or have a rectangle boundary
… do experiment
… of those three, overlapping, but you clear control for experiment one, two
… user...parallel experiments

Josh: per user or per ad?
… split your ads up per experiment
… can all be controlled server side
… but if it needs to be controlled @ side
… could you do at ad level?

Mehul: you cannot expose different ads to same user; looks like broken flow
… cannot switch back and forth with same user
… trying to understand client side decision
… many one
… send...if there are decisions to be made, need to send 8 answers back

Josh: intent here
… I know what you mean by A/B experiments, and understand you may have multiple going on at once
… thinking more A/B campaign, user is in control group
… that has smaller combinatorial problems
… you would essentially need to reach shared storage in FF
… same discussion and topic in the Github
… comes with the drawback that there will be microtargeting
… hey, Josh, get a mortgage with us
… figure out if that is comfortable
… get around corners

Mehul: might not be microtargeting
… A/B...if exposed, response time
… should not be microtargeting...assumption of user
… if exposed...response payload
… dictionary doesn't need to be exposed

Josh: Where we run into problem of leaking information

Mehul: not a solution, but not necessarily microtargeting

Josh: solves problem, not saying microtargeting is...

Mehul: worth exploring more
… decision variables, even if independent...

Josh: general user cross-site experiments; like to know how common a problem that is and how many bits you need for it to be useful

Ben: this is an interesting proposal; hits a number of use cases
… what has feedback been from Safari, @ and Edge

Josh: we have not; hard sell; requires fenced frames
… not yet see other browser to say yes; it requires aggregate reporting, another heavy lift
… Don't see other browsers jumping on right away
… will take time to adopt
… we talked about ff
… with other browser; they are interested in the idea
… at the beginning stages, cannot say where they are
… Privacy CG has discussion

Ben: What is overall feedback from browsers on aggregate reporting

Charlie: no formal feedback
… John W has said things
… but nothing formal yet
… I think there are some folks from Safari and Microsoft on this call
… feel free to add anything else

Josh: and please reach out to other browsers if this is important to you

ErikAnderson: Microsoft is generally supportive
… we are also exploring alternatives or enhancements
… hope it's clear to everyone

Wendy: thanks, Erik

ErikTaubeneck: quick question
… haven't thought this through
… any reason instead of using FF to do differentially private release
… instead of five URLs in frame, use @

Josh: We have been thinking about that
… it's in the explainer
… haven't given enough thought yet
… nice thing about FF approach, unless user clicks, info is not lost
… that makes it a safer thing to do; a much slower leak
… have to...how many times to call...if you call enough times you eventually see the data
… it's hard

Charlie: all those problems come up on aggregate side as well
… could probably reuse most of same budgeting logic
… and have this local differentially private system
… my concern is if utility is there
… we can look into
… once aggregate system
… when we nail down these constants to see if it provides value
… Erik, on your side, if you have concrete use cases amenable to this, I would be interested to hear
… if you want a high privacy regime
… what you could compute would be limited

@: please create topic
… we need to hear the use cases

Wendy: thanks...to the group
… I think this sounds like another interesting new primitive
… and key to getting implementation interest is helping to spell out use cases
… and add to use cases documentation
… and make that document something that can be used by and in conversations with other implementers to gather interest
… in getting a cross-browser implementation

Ben: FB perspective
… we are not thrilled with fenced frames
… It's not ideal to not know what ads you showed your own users
… we have a feature where users can see what ads they've seen, you can't do that with FF
… cannot obtain information about what ads they engaged or liked if you don't know what ad you showed them
… and analytics are harder if you don't know
… would cause tremendous amount of breakage
… I prefer non-ff solutions
… positive to investigate
… but we're not thrilled about fenced frames

Josh: return some fuzz data to embedder
… not sure how many use cases we can cover
… to get interest to user

Charlie: imagine user targeting use case
… show ad, extra time, would be a good experience

Josh: yes

Charlie: not a UX designer
… the number of ads going into the FF is small
… may be way to present to user
… these are two ads we considered, only one got shown
… might be some way to present that

Josh: part of what I'm hearing is the analytics is hard
… that sounds problematic

Ben: not saying it's impossible
… guess we are moving from world where we at least know what ads we showed on FB
… we know it's complete; know what was clicked on or not
… from privacy perspective...have to do this
… could do a good job with anonymization
… but not knowing...another leak...it's a tough sell

Josh: so using a @ on first party...would need to restrict and then have analytics

[too fast guys]

Ben: put us in weird position of cannot do any of these use cases without logging anything on the site; would be a hard choice

Wendy: that looks like the end of the queue
… any final questions for Josh?

<arnaud_blanchard> thanks Josh, very interesting proposal, let's continue the discussion on github !

Wendy: thanks again for sharing this with us
… I imagine as more questions come up in issues; you can come back and talk with us again

Josh: thank you; wonderful; great questions and feedback

Wendy: unless there is anything else, we will wrap it up here and watch for agenda requests on the list

[adjourned]

<btsavage> Great meeting today

<btsavage> very productive!

<btsavage> I'd love to have more meetings like this one

Minutes manually created (not a transcript), formatted by scribe.perl version 136 (Thu May 27 13:50:24 2021 UTC).

Diagnostics

Succeeded: s/to do things/to do partitioning/

Succeeded: s/[missed]/... [missed]/

Succeeded: s/amend/append/

Succeeded: s/@/keys/

Succeeded: s/have values/have no values/

Succeeded: s/@/csharrison/

Succeeded: s/tri/try/

Succeeded: s/@/charlieharrison/

Succeeded: s/@/Garrett/

Succeeded: s/encasulate/encapsulate/

Succeeded: s/to transfer/could be transferred/

Succeeded: s/use @/limit entropy/

Succeeded: s/used/used to choose among 1000/

Succeeded: s/not see/not yet see/

Succeeded: s/@/ErikAnderson/

Succeeded: s/EriK/Charlie/

Succeeded: s/regine/regime

Succeeded: s/you cannot do that/we have a feature where users can see what ads they've seen, you can't do that with FF/

Succeeded: s/@/embeder/

No scribenick or scribe found. Guessed: Karen

Maybe present: @, Ben, Blemparmentier, Charlie, csharrison, Garrett, James, Josh, Mehul, Wendy