Improving Web Advertising BG

16 March 2021


Present: anuvrat, apireno_groupm, AramZS, arnaud_blanchard, blassey, bmay, Brendan_TechLab_eyeo, daviddabbs, dialtone_, dinesh, dkwestbr, dmarti, ErikAnderson, eriktaubeneck, GarrettJohnson, gendler, hong, imeyers, jdel, jeff_burkett_Gannett, jrosewell, Karen, kleber, kris_chapman, lbasdevant, Mike_Pisula, mjv, mserrate, nics, pedro_alvarado, shigeki, wbaker, wseltzer
Chair: Wendy Seltzer
Scribe: Karen, Karen Myers

Meeting minutes

Wendy: Let's start by looking over the agenda
… we can return to data gathering and FLoC Q&A; look at dashboard highlights
… and look at Valentino's question on second price auctions?
… Any other business or items to queue up for future agenda?

Agenda-curation, introductions

Wendy: We are at a good point to start out with introductions
… Do we have any new participants [or guests] to introduce?

Chris Martinez: I am director of sales at Hearst Television

Tom Carroll: Publicis Media

Wendy: welcome, Tom
… we use the irc channel to queue
… if you would like to speak or raise comments

Data-gathering and FLoC Q&A (continued)

Wendy: a few weeks ago we had some discussion on FLoC

<jrosewell> https://github.com/w3c/web-advertising/issues/109

Wendy: We didn't get through all of the questions people had then
… so we want to open again for discussion
… if people have open issues to discuss today?

Brian: I think there are two general areas of instability
… on the one hand you have algorithms that will change infrequently
… they will change the entire landscape
… and on other hand you have users whose browsing behavior will affect the cohort assignment
… Can we get a sense of the volatility?
… Is the user someone who has little history,
… observed by the browser moving in/out of cohorts frequently?
… Or is this cohort one with such ephemeral activity that it sees a high population turnover?

kleber: Michael Kleber, Chrome. Let me see if I can answer those questions
… One question is whether the browser can provide additional info alongside a person's FLoC
… to see if the person is firmly in this FLoC
… or moves over time
… The answer should be no
… the attempt is to show some clustering and some amount of recent activity
… FLoC represents recent activity rather than a collection of people
… the intent is not to know people
… but to show how likely you are to want to advertise to those types of people, depending upon how a FLoC tends to behave
… browsers ought not show the behavior of people in the FLoC; I would not favor that
… in terms of volatility more generally
… It does seem quite likely that some FLoCs will be more or less volatile
… if your FLoC is calculated every week
… then sites people go to weekly vs occasionally may put people into different FLoCs
… the predictive value of a FLoC should give you different information
… and get onto different FLoCs
… On the other hand, it could be possible to have two FLoCs right next to each other that you would want to bid on
… people may float freely between them
… could be different in clustering, but not affect bidding so much

Brian: That makes sense
… Trying to get a sense of what kind of resources to put in to understand what a FLoC means in a particular context

<wseltzer> https://github.com/WICG/floc

Brian: and how to limit the amount of resources to advertise to the people we want to get to

Michael: Good question
… I hope during the origin trial you can do experimentation to figure out
… depends upon the choice of the FLoC clustering algorithm
… such as in M89 when we turn origin trial on
… that is the start of FLoC
… Everything here is subject to analysis
… and further experimentation
… Anything like that, I think the answer is that different clustering algorithms may give different feedback
… If you find out this FLoC clustering algorithm is better than that one
… where this other algorithm is more flighty and ephemeral, then it would be helpful for us to know

Brian: Will there be some information about version numbers as you go through the origin trials?

Michael: yes, version string corresponds to clustering algorithm

Brian: Would we be able to request a specific algorithm during the trials?

Michael: no
… not during trials
… try different types of data

Brian: Two more questions
… a user who clears browsing history regularly will have different experience than one who does not clear
… different clustering process
… will you publish data on size, clustering processes

Michael: yes and yes
… we certainly will publish information about each clustering algorithm
… with what sizes FLoCs tend to be
… size, number of FLoCs
… but not necessarily about each FLoC individually
… Your first question?

Brian: Is there some kind of threshold the user has to get to?

Michael: yes, if you clear your browsing data, your info will be erased
… at outset there will be some number of sites that a person visits
… I think it's three different web sites
… but details of origin trials are open to change

Brian: Any possibility of getting young, middle aged, mature assignment?

Michael: no, I don't think so

James: Deepak, on the 23rd
… looking at this diagram to understand the 95% figure
… that got distributed around the media
… we weren't really clear on how that 95% figure was calculated
… and that 95% figure is a justification for a 5% reduction to the current solutions
… It's important to get a better understanding of that
… Can you dive into that now, or later, or if another proposal?

Deepak: Do you have any specific things? We are addressing your questions
… if not specific, I can do another attempt
… Last time there was some question on the cookie models
… that was not an accurate statement; the prediction models do not use a cookie model
… FLoC system was scored assuming no cookies
… one skew assumed no cookies
… you could actually see an improvement in performance

<joshua_koran> @Deepak were training models using ANY identifiers? Afterall Google has many other identifiers not stored in cookies.

Deepak: our current system hampers the FLoC system due to skewing
… Let's go back to basics
… The published numbers are for cohortization of audiences
… if you have an audience of people interested in music, sports; these are general advertisers
… interested in remarketing and fine-grained targeting
… there is some sort of cohortization happening today on the DSP side or other adtech or publisher side
… In new world, cohortization happens in the browser
… could happen DSP side, but details matter here
… Instead of going deep into testing of algorithms
… I cannot show you the code
… hence origin trial to test
… I do agree that a smaller player may not be able to test
… but if they are using cookies, they do have some intelligence
… the same tech in theory could be used in FLoC

James: If I can summarize
… you are saying that cohortization happens within the browser or on the DSP side today
… so the 95% effectiveness figure was for cohortization done in the browser
… and we are asking how that 95% was calculated so everyone can have an understanding
… it's data that I'm curious about
… you don't have to publish the whole source code to work out the variables used
… seems not too burdensome to put out
… prior to the origin trial that will be expensive for any organization to engage in

Deepak: at Google, as I'm sure with any other major DSPs, SSPs, A/B experiments are done
… with control and experimental traffic
… from the advertiser POV there are benchmark metrics
… set and tested against to make sure changes affect advertisers and publishers in a way that is consistent with the design of the experiment
… Usual philosophy is to change one variable at a time
… In this system, FLoC was the @
… for evaluation purposes, for control and experiments
… for cohort of advertisers, on both sides, you count the number of conversions for each specific advertiser
… In control you have set of advertisers who spend money, have clicks
… and you have smaller set of advertisers who could vary slightly
… or could be the same
… for the tail advertisers
… for the treatment, we have advertisers with ads, clicks and conversions
… You take your spend and your conversions
… conversion per dollar
… take all the conversions and spend, get conversion per dollar
… for all advertisers in control and treatment groups
… there are various benchmarks to control
… when you compare conversions, there are apples-to-oranges comparisons
… if one advertiser sells high-end vs. another advertiser selling lead-gen
… one technique is called Mantel–Haenszel
… even more techniques used
… that use statistical sampling
… when you use these techniques to use control vs. experiment
… if the treatment/control ratio is one, they perform the same
… if greater than one, better for the advertiser
… if less than one...
… in this case, it was .95
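As a toy illustration of the metric Deepak describes (all numbers are hypothetical, not Google's data): pool conversions and spend for each arm, compute conversions per dollar, and compare treatment to control.

```python
# Toy sketch of the conversions-per-dollar comparison described above.
# All advertiser figures are made up for illustration.

def conversions_per_dollar(advertisers):
    """Pool conversions and spend across all advertisers, then divide."""
    total_conversions = sum(a["conversions"] for a in advertisers)
    total_spend = sum(a["spend"] for a in advertisers)
    return total_conversions / total_spend

control = [
    {"spend": 1000.0, "conversions": 50},
    {"spend": 400.0, "conversions": 12},
]
treatment = [
    {"spend": 1000.0, "conversions": 47},
    {"spend": 400.0, "conversions": 12},
]

# A ratio below 1.0 means the treatment underperformed the control.
ratio = conversions_per_dollar(treatment) / conversions_per_dollar(control)
print(round(ratio, 2))
```

In practice, as Deepak notes, stratified techniques such as Mantel–Haenszel would be applied on top of this to avoid apples-to-oranges comparisons across very different advertisers.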

James: what did you mean by conversion

Deepak: for many advertisers
… advertisers are either charged using impressions or clicks
… so a conversion is essentially defined by the advertiser, such as a sale on their site
… a pixel that fires
… or it could also be some sort of shallow conversion
… where someone fills out a form for lead generation
… some action where an advertiser learns that a person did something from which ROI can be computed

James: what was the sample size?

Deepak: we have done a ton of tiny experiments
… we did cookie-splitting experiments
… both arms are equal
… in advertiser spend
… depending upon how much spend an advertiser has
… let's say 0.5%, 1% or 0.1% of traffic could be diverted

James: so just fractions of one percent
… were advertisers opting in? Or chosen at random?

Deepak: all the advertisers were chosen for a tiny percent of the traffic

James: thank you for providing a tiny bit more insight
… I will yield to others

Deepak: thank you, James

Wendy: note that I have closed the queue for now

Erik: good morning
… a few questions to plow through
… probably softballs for Michael
… How many bits for a FLoC?
… expectation of tens of thousands of browsers in a FLoC, a few billion sessions?

Michael: initial origin trial does not specify bits for FLoC ID; paper on which Deepak was an author ... [missed]

<jrosewell> Thank you. I would appreciate more details concerning the previously announced "95%" experiment added to this issue - https://github.com/w3c/web-advertising/issues/109

Michael: SimHash, implemented in Chrome
… first calculate a large hash of the user's browsing history
… then only use the first N bits of that hash to make sure the cohort is large enough
… Different FLoCs have different numbers of SimHash bits that go into the calculations

Erik: the size of the post-image before you merge them together?
… SimHash maps into some 2-to-the-N space

Michael: that is like 50 bits
… do some merging

Erik: a lot of merging happening

Michael: yes, until cluster gets to thousands of people

Erik: My understanding is that lexicographically, SimHash in the @ space
… have you considered a progressive algorithm where you merge in Hamming space vs. lexicographic space

Michael: we have looked at many algorithms
… the one we have chosen is nearest neighbor
… not Hamming and not lexicographic
… every cluster is every person who has a certain prefix of the 50-bit hash

Erik: That's helpful

Michael: sort of lexicographic; all sizes are powers of two; binary tree structure of SimHash prefixes
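Michael's description (a large SimHash, truncated to a prefix until each cohort is big enough) can be sketched roughly as follows. This is an illustrative toy, not Chrome's code: the hash function is a stand-in for real SimHash, the cohort-size threshold is tiny, and a single global prefix length is used instead of the per-branch binary tree Michael mentions.

```python
import hashlib

HASH_BITS = 50        # "like 50 bits" per the discussion
MIN_COHORT_SIZE = 3   # illustrative only; the real threshold is thousands of people

def sim_hash(history):
    # Stand-in for SimHash: real SimHash hashes feature vectors derived
    # from browsing history; here we just hash the joined domain list.
    digest = hashlib.sha256("|".join(sorted(history)).encode()).digest()
    value = int.from_bytes(digest, "big")
    return value >> (256 - HASH_BITS)  # keep the top HASH_BITS bits

def assign_cohorts(users):
    """Shorten the shared hash prefix until every cohort is large enough.

    Simplification: one global prefix length, whereas the actual design
    merges per-branch so different cohorts can use different prefix lengths.
    """
    cohorts = {}
    for prefix_len in range(HASH_BITS, -1, -1):
        cohorts = {}
        for user, history in users.items():
            prefix = sim_hash(history) >> (HASH_BITS - prefix_len)
            cohorts.setdefault(prefix, []).append(user)
        if all(len(members) >= MIN_COHORT_SIZE for members in cohorts.values()):
            return cohorts
    return cohorts
```

Because SimHash places similar browsing histories near each other in Hamming space, truncating to a common prefix merges similar users while the size threshold keeps each cohort anonymous.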

Erik: Will browsing history include data synced across browsing sessions?

Michael: For initial version for origin trial, only looks at device you are on
… we can talk about merged browsing histories in future, but not now

Erik: how often does it update? On demand?

Michael: Doesn't update more than once a week
… all of these things are subject to change

Erik: just asking specifically about the origin trial
… Imagine two sites a browser visits weekly
… envision N visits over certain number of weeks; that could become an identifier
… have you considered that?

Michael: Yes, we have come up with some potential mitigations
… visiting nearby FLoCs for example
… not in very first version
… more of a finger-printing vector

Erik: Assume there is room to dive in more deeply to that question on GitHub

Michael: absolutely

<wseltzer> GitHub issues for FLoC

Erik: See if this is in line with what others are thinking
… not a need to understand FLoCs, but use them more for ranking
… like % on FLoC A or FLoC B and score models against it
… more about the outcomes I care about?
… Does that jive?

Deepak: yes
… two ways to look at it
… look at learning models
… another way is someone big into cohortization
… could use some kind of taxonomy on top of it
… like how people currently use cookies... and heuristics
… could be one-size-fits-all
… or a replacement for a cookie-based system

Erik: Thanks, that's all I have

Valentino: this seems to be a system
… not to reverse-engineer the shoe and handbag lovers
… more an ML model
… on one side a space of cohort IDs and on the other side browser IDs
… all memberships are going to be updated
… if you are going to train ML on the other end
… on the relationship between cohort ID and conversion
… it's important to get a conversion
… over time, the domain name is going to be one of the features, with a URL
… perhaps the semantics of the page, topics on the page being visited
… there is nothing in the specs on how we are going to receive notice that you are upgrading the algorithm
… nothing about rollouts of the algorithms
… talk about that

Michael: there is something about this
… the info you get from browser when you call the interest cohort API
… has two things
… one is which cohort you are in, a number
… the other is a string, a label for the particular clustering used
… I think it's called a version in the spec
… its version will be its own clustering algorithm
… so by watching that, you will know which algorithm it was
… when we call it, will always know which one was applied
… in future when two clustering techniques running at same time, would get two version strings
… one for some, one for others
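One consumer-side implication of the version string Michael describes (a hypothetical sketch, not part of any spec): cohort numbers are only meaningful within a single clustering version, so any learned statistics should be keyed by the (version, cohort) pair rather than by the cohort number alone.

```python
from collections import defaultdict

def make_cell():
    return {"impressions": 0, "conversions": 0}

# Hypothetical sketch: keep separate statistics per FLoC version string,
# since cohort IDs from different clustering algorithms are incomparable.
stats_by_version = defaultdict(lambda: defaultdict(make_cell))

def record_event(version, cohort_id, converted):
    # Keying by (version, cohort) means a rollout of a new clustering
    # algorithm starts fresh instead of polluting models trained on the old one.
    cell = stats_by_version[version][cohort_id]
    cell["impressions"] += 1
    cell["conversions"] += int(converted)

def conversion_rate(version, cohort_id):
    cell = stats_by_version[version][cohort_id]
    if cell["impressions"] == 0:
        return None
    return cell["conversions"] / cell["impressions"]
```

This also addresses Valentino's retraining concern below: during a staged rollout where two versions run at once, both sets of models can coexist until the old version disappears.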

Valentino: I think the readme does not contain version at all; it's in the .bs file

Aram: we need more data on the readme for this
… you have an answer clearly so it should be on there

Deepak: any ML algorithm, @ rather than FLoC ID
… if small traffic, systems will gradually adapt

Valentino: are you thinking anything around speed at which IDs will be updated
… if you update in a single moment, all the models of the buyers will be outdated

<wseltzer> [the .bs file generates this doc, https://wicg.github.io/floc/ I think]

Michael: I think this is more of a Chrome question
… Valentino, a good question on how to introduce a new algorithm

… nothing we have written down on how that transition works
… expect we would want to do something like: for some percentage of people, give the new FLoC for the new version
… for four to six weeks in Chrome release
… and then for next release
… we're happy to have conversations about that
… FLoC updates expected once in a while; very slowly to make your lives better

Valentino: can you keep the random vectors?

Michael: yes, the pseudorandom algorithm uses a fixed number
… different algorithms will be entirely different and will change
… cannot make any promises about new and old algorithms

Valentino: this will cause retraining needs for ML

Arnaud: Deepak, you mentioned last time about frequency capping used in both calculations
… so it's a user feature, yes?
… FLoC doesn't support it
… We ran experiments on our side of not capping displays to users
… the impacts are several multiples of the 5% delta you mentioned
… you used @ in your testing
… do you agree testing is not representative?

Deepak: We are getting into the weeds
… what does Google do for Safari frequency capping today
… we were thinking
… frequency capping, we went through a bunch of ideas
… one idea was to use first party cookies to perform frequency capping
… info is out in a blog post
… for Safari, there is a key insight
… most users visit X number of sites per week
… they tend to visit same sites over and over again
… the way frequency works for Safari
… visit this site, see some features and do some adjusting
… in theory, you could use same thing on Chrome side
… we did not adjust frequency capping
… because we thought in theory we could use same solution as Safari
… for average advertisers, these effects were pretty consistent
… We can debate about it, but that is answer I have
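The first-party-state approach Deepak sketches could look roughly like this (purely illustrative; the cap, window, and data layout are made-up parameters): each site keeps its own impression counter, so capping is per-site rather than deterministic across the whole web, which is the limitation Arnaud raises next.

```python
import time

CAP = 3                 # illustrative: max impressions per campaign per window
WINDOW_SECONDS = 86400  # one day

def should_show_ad(first_party_store, campaign_id, now=None):
    """Frequency capping backed by first-party state.

    Each site maintains its own timestamps, standing in for a first-party
    cookie; there is no cross-site view, so a user can still exceed the
    cap across many different sites.
    """
    now = time.time() if now is None else now
    # Drop impressions that have aged out of the window.
    recent = [t for t in first_party_store.get(campaign_id, []) if now - t < WINDOW_SECONDS]
    first_party_store[campaign_id] = recent
    if len(recent) >= CAP:
        return False
    recent.append(now)
    return True
```

The per-site scope is exactly why this is only an approximation of the deterministic, cross-site capping that third-party cookies allowed.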

Arnaud: so users cannot do deterministic frequency capping
… some users will see thousands of ads in same day
… since that fell through the cracks

… last time you mentioned FLoC will use FLEDGE
… don't you think in this case, which would be vast majority of marketers
… more important than the 5% you mentioned since you have to use FLoC through FLEDGE

Deepak: if FLoC is applied through FLEDGE, not same performance? Is that correct?

Arnaud: somehow use part of FLEDGE with FLoC; quoting previous minutes
… meaning performance is impacted by whatever FLEDGE is doing in the browser
… do you think impact is lower than the 5%?

Deepak: trying to understand more clearly
… there is a frequency capping using FLEDGE like ideas
… way to do it storing inside the browser
… I'm unable to understand if when using that technique there would be lower performance?
… If you do deterministic technique, how does that work?

Arnaud: If you do frequency capping as a no-brainer, which most advertisers and publishers do
… how can we do that through FLEDGE
… how much will it impact, on top of the 5% downlift
… going from cookies to FLoC
… the reason for this
… is you did @
… you advertised this number
… as an impact that is more massive than the tiny use case you mentioned

blassey: I want to push back on that

Arnaud: I want to push back on that

<jrosewell> Doing the frequency in the web browser will require the web browser to decide which advert gets shown. The rest of the ecosystem will not be involved.

Arnaud: we need everyone to know; it's too important to leave as a conversation between us on this call

Brad: they did not represent it as you say; that's not their fault
… hammering on this point, which is not what they said, is not useful

Arnaud: Can you share where the details are?
… we did not find anything that proves what you just said

Deepak: I explained the A/B experiment

Arnaud: length of test, marketers you used, conversions you used; leads
… give us the proper sense
… I am not arguing the quality of analysis

<jrosewell> What other factors were present? How representative are these other factors to the real FLoC world?

Deepak: Trying to understand what would satisfy you?
… If I write up the A/B experiment, and if it's unclear
… then I will write that up
… what Brad said, we did not represent what we wrote incorrectly
… perhaps it was interpreted differently
… be clear about that
… I want to be constructive; I will publish all the details
… I will adhere to true scientific spirit

Arnaud: good to go back to journalists who got it wrong

Wendy: I want to step in
… on where we as BG are trying to go

<GarrettJohnson> To make sure media does not misrepresent your point, you need to provide more than a sentence and you need to provide the limitations of what you are doing. I don't think Criteo is being unreasonable here.

Wendy: appreciate where this dialogue brings data to conversation and to raise new questions about proposals
… Let's keep focused on what other information we need and how we can provide that
… to move the conversation forward
… I know I owe a scheduling for a further ad hoc meeting
… because we still won't get through all of the queue today

<jrosewell> "Our tests of FLoC to reach in-market and affinity Google Audiences show that advertisers can expect to see at least 95% of the conversions per dollar spent when compared to cookie-based advertising." - https://blog.google/products/ads-commerce/2021-01-privacy-sandbox/

Arnaud: more details being published and the raw data would really be appreciated

Wendy: just a few minutes left

Kris: I have hopefully an easy and quick question
… I was thinking about FLoC adding in a new fingerprinting surface
… Will Chrome remove it after origin trials?
… thinking in terms of privacy budget; how to compensate for a new fingerprinting surface?

Brad: different folks will want to use different portions of the web API surface
… for each individual site they can use what they want up to a limit for identifiability
… we won't remove APIs from API surface

Kris: Will privacy budget come out around the same time as FLoC?
… ballpark same time?

Brad: looking at privacy budget enforcement coming later than when FLoC comes and cookies go away

Wendy: I see Valentino suggesting that we start with presentation around the second price auction
… thank you for that
… I watch for a poll email
… for side meeting on further FLoC conversation
… and of course glad to see plenty of issues being raised in the WICG repository for those discussions to continue

Angelina: just one quick question
… Thinking about FLoCs, several audience segments that don't fall into certain behaviors or combination of behaviors
… such as high net worth individuals looking at financial sites and luxury
… Is there ability to combine cohorts or ways to create audience segments not based on domains
… around behaviors
… and also for minors and adults
… there are several advertisers who are not able to advertise to minors at all, even if minors have similar adult behaviors
… may need an additional age layer

Michael: sure
… FLoC is good for some things, but not for other things
… discussions around 95% number in the experiment
… FLoC is good to retarget certain types of audiences
… frequency capping and knowing a user's age are also not useful with FLoC
… for specific types of audiences Angelina mentioned, we hope the origin trial will include experiments

<jrosewell> What other solutions have been considered that might be less impactful?

<wseltzer> [adjourned]

<mjv> This is from the Google blog: "Results indicate that when it comes to generating interest-based audiences, FLoC can provide an effective replacement signal for third-party cookies."

Michael: to determine which use cases FLoC is good for, and what FLoC is not good for

Wendy: a good place to end
… thanks, Karen for scribing

Minutes manually created (not a transcript), formatted by scribe.perl version 127 (Wed Dec 30 17:39:58 2020 UTC).




Maybe present: Angelina, Aram, Arnaud, Brad, Brian, Deepak, Erik, James, Kris, Michael, Valentino, Wendy