Improving Web Advertising BG

Meeting minutes

Wendy: Welcome. We are still waiting for people to arrive

Wendy: you do not need an account; click the webex link and join from there

Wendy: I see people joining now; give it a minute more
… Let's look at the agenda
… the bulk of the meeting will be a presentation from Jaime Perez and Wendell Baker on Yahoo's experience with Google Chrome's Trust Tokens

Wendy: Jaime are you here?

Jaime: Yes

Introductions and Agenda Curation

Wendy: Let's start with introductions; is anyone new to the group who would like to introduce themselves?

Bubul: I will get started. I represent Coupang.com
… we are the largest e-commerce retailer in Korea
… Karen invited me as a guest today
… we are in the process of joining

<npd> welcome

Bubul: very excited to be part of discussions moving forward

<bmay> Welcome, Bubul

Wendy: Welcome, Bubul

Yahoo Experience with Google Chrome's Trust Tokens (Jaime Perez and Wendell Baker)

Wendy: anyone else?

Wendy: let's move over to our presentation on Trust Tokens
… Jaime, do you want to share your screen?
… and introduce the subject
… we heard a bit about the subject a few meetings back
… our goal is to hear about your experience with the tech, and the questions you are encountering

Jaime: Can you see my screen?

Wendy: yes

Jaime: Thank you everyone and thanks for inviting us
… rough time check?

Wendy: meeting runs for one hour; this is the main subject
… so if you have presentation and Q&A

Jaime: Around 20-25 minutes plus some Q&A time
… Today we will talk about our experience with Google Chrome Trust Tokens

<wseltzer> Yahoo slides

Jaime: Who are we; Jaime is sr. principal architect, identity systems for Yahoo
… and Wendell Baker

Wendell: I am an architect in ad systems and look after projects like this
… which is our cookie-less strategy; this is one example

Jaime: on top of that, there are many others off-stage who helped with this proof of concept that we built
… We will cover, first, why we did this and what questions we wanted to answer; second, what we built and how it works
… and lastly share what we learned and areas of concern
… before we delve into this
… We will first call out what we found

<wseltzer> [slide 4]

Jaime: We were able to successfully build a proof of concept on Trust Tokens
… but we were unable to show clear proof of value

<wseltzer> [slide 5]

Jaime: there are areas of concern and to simplify
… and protocol evolution; we don't know the effect of multiple protocol versions
… those are some of concerns at a high level

… the Motivation; what was our expectation a year ago
… Google is going to replace some aspects of 3rd party cookies
… we wanted to see if this all works
… At Yahoo we wanted to understand what kind of use cases we can address by leveraging the Google Trust Tokens
… where it works, where it fails, understand the use cases for the business
… does it benefit costs; and identify concerns
… Those were our questions going in
… how feasible to use Trust Tokens as gate keeper
… Now we covered the why, let's talk about the what
… We built a proof of concept in Q4 2021
… we successfully issued TrustTokens between our own services
… Something to cover is what we did not do
… We did not test at scale; did not operate multiple protocols or versions - only did v2
… we did not address at production scale....
… see how it works and treated it as such
… Use case we explored
… distinguish between a real user and an imposter
… when serving ads and when user is trying to login or create an account
… We had Yahoo.com be an issuer
… It's up to issuer to decide what metric and determination to set that trust
… Trust Token allows you to capture and store that trust
… We generated a trust token stored on the browser
… Step two, user goes to aol.com
… what aol.com would do
… check the trust token from Yahoo
… and then process trust token
… step four, AOL uses it in whatever business logic they are trying to handle
… such as fraud or login attempts
… use trust token signal from Yahoo in aol
… if AOL is serving Yahoo ads
… if traffic is fraudulent, you want to treat differently or not charge
… this is the use case we explored
… Here is a high-level diagram of how everything worked together; different modules and components in the prototype
… High level there were three layers
… top layer is the browser/client
… middle layer is web service, wherever Yahoo.com or AOL.com is running
… on back end Trust Token service that issues and redeems tokens, provided by Google
… separate host
… issue token and receive that token
… using BoringSSL, Google's fork of OpenSSL
… Trust Token protocols intended to use
… First step is user goes to Yahoo.com
… then load JS
… to begin, no trust tokens
… browser will create a blind nonce
… in step four sends it to Yahoo
… and Yahoo propagates it to the trust token service
… based on that, it generates a blinded signed token, returned in step 7
… and step 8 written in browser storage
… wherever you have all your cookies and artifacts
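
The three layers and issuance steps described above can be sketched as a small Python simulation. This is only an illustration, not code from the Yahoo proof of concept: the class names, method names, and the placeholder "signature" are all hypothetical, and no real cryptography is performed.

```python
import secrets


class TokenService:
    """Hypothetical stand-in for the back-end Trust Token service."""

    def sign(self, blinded_nonce: bytes) -> bytes:
        # Placeholder "signature": a real service would run the
        # cryptographic issuance protocol here.
        return b"signed:" + blinded_nonce


class WebService:
    """Middle layer, e.g. the site acting as issuer."""

    def __init__(self, origin: str, token_service: TokenService) -> None:
        self.origin = origin
        self.token_service = token_service

    def handle_issuance(self, blinded_nonce: bytes) -> bytes:
        # Forward the browser's blinded nonce to the token service.
        return self.token_service.sign(blinded_nonce)


class Browser:
    """Top layer: keeps issued tokens in per-issuer storage."""

    def __init__(self) -> None:
        self.token_store: dict = {}

    def request_token(self, site: WebService) -> None:
        blinded_nonce = secrets.token_bytes(16)  # stand-in for a blinded nonce
        signed = site.handle_issuance(blinded_nonce)
        self.token_store.setdefault(site.origin, []).append(signed)

    def has_trust_token(self, issuer_origin: str) -> bool:
        # Mirrors the later "is there a token for this issuer?" check.
        return bool(self.token_store.get(issuer_origin))
```

In this sketch the browser starts with an empty store, calls `request_token` against the issuing site, and afterwards `has_trust_token` reports a token for that issuer only.
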
… starting in Chrome 88 and above, there is Trust Token in the browser
… you may also know that there is a private metadata bit (PMB)
… Trust protocol includes...
… issue when user is trusted
… trust tokens are available for anyone to see
… might indicate that we, Yahoo, trust that user
… to make it harder for an abuser, always issue trust token with private metadata
… and also provide a user bit with 1 or 0
… to believe 1 is trusted or 0 if untrusted or a bot
… when page is loaded
… as part of issuing it sends that bit
… down to step 6
… to encrypt and sign
… user keys
… and decide if user is trusted or not
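
The idea of always issuing a token while hiding the trusted/untrusted bit in the choice of signing key can be illustrated with a toy Python sketch. It deliberately swaps in HMAC with two secret keys; the real protocol uses a different, asymmetric construction, and every name below is hypothetical.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Hypothetical issuer-side secrets: one key per private metadata value.
KEY_UNTRUSTED = secrets.token_bytes(32)  # encodes bit 0
KEY_TRUSTED = secrets.token_bytes(32)    # encodes bit 1


def issue_token(trusted: bool) -> dict:
    """Always issue a token; the bit is implied by which key signs it."""
    nonce = secrets.token_bytes(16)
    key = KEY_TRUSTED if trusted else KEY_UNTRUSTED
    tag = hmac.new(key, nonce, hashlib.sha256).digest()
    return {"nonce": nonce, "tag": tag}


def read_private_bit(token: dict) -> Optional[int]:
    """Only the key holder can recover the bit; None means no key matches."""
    for bit, key in ((0, KEY_UNTRUSTED), (1, KEY_TRUSTED)):
        expected = hmac.new(key, token["nonce"], hashlib.sha256).digest()
        if hmac.compare_digest(expected, token["tag"]):
            return bit
    return None
```

Both kinds of token have the same shape, so an observer without the keys cannot tell a "trusted" token from an "untrusted" one, which is the property the presenter describes.
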
… now that the browser has the trust token
… how can users consume it
… step 10
… user goes to AOL.com
… consider if trusted...step 11
… when AOL loads JS
… it checks for presence of trust token for Yahoo.com in the browser
… so it exists and goes to step 13
… and since trust token for redemption
… step 13-14 encrypts trust token
… a sample redemption record
… a JSON block
… you can control
… with the private metadata bit
… when Yahoo created trust token
… this browser is believed to be a human
… that redemption record is written and available in the browser storage
… you can further encrypt via your own signing mechanism
… or protocol itself allows you to encrypt
… for this example...it was a clear text JSON blob
… AOL reads that private metadata bit
… and then consumes whatever business logic; like serve this ad, allow log-in to happen; what user wants to do
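
How a redeeming site might read a clear-text redemption record and branch on the private metadata bit can be sketched as follows. The JSON field names and the decision labels are assumptions made for illustration; the minutes do not record the actual record layout used in the proof of concept.

```python
import json

# Hypothetical clear-text redemption record; field names are illustrative.
SAMPLE_RECORD = '{"issuer": "https://yahoo.com", "private_metadata_bit": 1}'


def decide(record_json: str, accepted_issuers: frozenset) -> str:
    """Map a redemption record to a business decision on the redeemer side."""
    record = json.loads(record_json)
    if record.get("issuer") not in accepted_issuers:
        return "ignore"            # record from an issuer we do not rely on
    if record.get("private_metadata_bit") == 1:
        return "serve_ad"          # issuer believed the user to be human
    return "treat_as_suspect"      # issuer flagged the user as untrusted
```

For example, `decide(SAMPLE_RECORD, frozenset({"https://yahoo.com"}))` returns `"serve_ad"`, while the same record with the bit set to 0 returns `"treat_as_suspect"`.
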
… This is at a high level of how things work
… One thing to call out, to protect the user
… when token is first issued, it is blinded
… when redeemed, it is unblinded and sent to user in step 13
… to prevent fingerprinting
… you could fingerprint otherwise
… there is some blinding happening to prevent Yahoo from fingerprinting the user
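
Why blinding keeps the issuer from linking issuance to redemption can be shown with a toy RSA blind signature in Python. This is a swapped-in textbook technique, not the Trust Token protocol itself, and the key is far too small for real use. In this sketch the client does the unblinding; all names are hypothetical.

```python
import hashlib
import math
import secrets

# Toy issuer key built from two Mersenne primes (illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N, E = P * Q, 65537
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, issuer only


def digest(nonce: bytes) -> int:
    return int.from_bytes(hashlib.sha256(nonce).digest(), "big") % N


def blind(nonce: bytes):
    """Client: hide the nonce digest behind a random factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (digest(nonce) * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer: signs without ever seeing the underlying nonce."""
    return pow(blinded, D, N)


def unblind(blind_signature: int, r: int) -> int:
    """Client: strip the random factor to get a signature on the nonce."""
    return (blind_signature * pow(r, -1, N)) % N


def verify(nonce: bytes, signature: int) -> bool:
    return pow(signature, E, N) == digest(nonce)
```

The value the issuer signs differs from the value later presented for verification, so the issuer cannot match the two by simple comparison.
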
… Let me recap what we learned
… we were able to issue a trust token successfully
… but unable to determine proof of value
… Areas of concern
… unclear if we are using Trust Tokens for the right Use Cases
… we found the greater value when Trust can be shared with other companies
… we have AOL consuming token from Yahoo
… but we may not gain much value
… but if we can rely on trust for other corporations like Expedia and Amazon
… it would give greater value
… We also found that redeemer sites rely on at most 2 issuers
… Yahoo.com is one of the two issuers that AOL was checking
… so if we used Google
… then AOL is tied to Yahoo and Google as issuers
… if we made an agreement with Expedia, we could not use their Trust Tokens because we would need to pick up Expedia as a third issuer
… the limit is mostly there to avoid fingerprinting, but it's not scalable
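
The two-issuer limit on a redeeming site can be modelled in a few lines of Python. The limit of two comes from the presentation; the class and method names are hypothetical and do not correspond to any browser API.

```python
MAX_ISSUERS_PER_REDEEMER = 2  # limit as reported in the presentation


class RedeemerSite:
    """Tracks which issuers a redeeming site has tied itself to."""

    def __init__(self, origin: str) -> None:
        self.origin = origin
        self.issuers: list = []

    def associate(self, issuer: str) -> bool:
        """Return True if tokens from this issuer can be used on the site."""
        if issuer in self.issuers:
            return True                      # already one of the issuers
        if len(self.issuers) >= MAX_ISSUERS_PER_REDEEMER:
            return False                     # no room for another issuer
        self.issuers.append(issuer)
        return True
```

With Yahoo and Google already associated, a call for a third issuer such as Expedia returns `False`, which is the scaling problem raised here.
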
… Another thing we found is any site can see that Yahoo.com issued Trust Tokens for that browser
… it was issued whether trusted or not
… user knows that Yahoo issued trust tokens for that browser
… We also found some overhead
… the backend, the brains of the Trust Tokens protocol, the SSL that Google provided
… intention that everyone uses this library?
… if we are all supposed to use this library, then everyone will need to do some security vetting
… Another thing we found was that the protocol version cadence wasn't clear to us
… now it's v3
… this proof of concept was based on v2
… and v3 was not backwards compatible
… and it broke our POC
… it kept changing
… Maintain key commitments
… this trust token protocol relies on three sets of keys
… one for protocol, one for signing Trust Token, and one for determining if user is trusted
… Handling failures
… multiple places where some people go wrong
… hard to debug sometimes; we would get a generic failure, "Trust Token failure"
… hard to keep the bugs clear
… To wrap up
… Third party cookies in Chrome to be problematic around 2023-2026
… Yahoo looking to use Trust Tokens for fraud prevention

<wseltzer> Trust Tokens repo

<wseltzer> [slide 6]

<wseltzer> [slide 7]

<npd> serving ads and allowing login seem like very distinct use cases with very different properties about fighting fraud. I would be surprised and impressed if the same trust signal works for both

<wseltzer> [slide 8]

<wseltzer> [slide 9]

<npd> is there a github issue tracking this question about limiting the number of issuers?

<npd> thank you for this presentation and sharing implementation experience

[missed last couple of points on last slide]

John Wilander, Apple: thank you; loved the presentation
… question on redemption side
… I was under impression that there was an iFrame necessary
… for context and only there would there be communication about redemption of a Trust Token
… I think it relates to "everyone can see the Trust Tokens" and that is a security risk

Jaime: in our POC, it was mimicking the sample demo
… with fetch API into Yahoo.com
… that's how we set up the sample POC

JohnW: surprised, thank you
… look forward to future plans

Steven_Valdez: I can try to clarify areas of concern
… I'm the developer who worked on the Boring SSL library
… our redemption can happen from anywhere
… if embedded in some third party, long tail
… no issuers or stake; embedded in iFrame
… that is not currently a requirement
… skip use cases; captcha use case has come up a lot
… also anti-fraud values to be put into Tokens
… these are valuable to share
… one thing not on list
… Chrome was running a Trust Token origin trial experimenting with API
… some design decisions likely to move into general version
… and hope to clear up other areas
… don't have great solutions...more issuers add more fingerprinting vectors
… short of restricting issuers
… long-term try to find techniques to avoid restricting issuers
… one potential solution
… redemption time
… issuer can make decision if revealing token; only allow certain parties to redeem tokens
… Trust Tokens is based on Privacy Pass
… and other versions of Privacy Pass are being experimented on
… And Apple is giving a presentation at WWDC tomorrow
… use similar technology
… and similar protocols that all browsers would use
… and hopefully have story for how to support old and current versions
… we left that out of the origin trial
… maintaining key commitments
… once more browsers support these APIs
… hopefully better ways to maintain keys and make value better
… Error codes aren't super great
… without introducing side channels to add extra fingerprinting
… NGA as we update protocol and as people update libraries, hope to get better error stories

<wseltzer> https://datatracker.ietf.org/wg/privacypass/about/

Wendy: Thanks, Steven

Jaime: thanks for sharing what you are thinking

<Zakim> AramZS, you wanted to say Is the Proof of Value more obvious if the alternatives are not available in a post cookie world? And would broader browser adoption possibly going to change some of your conclusions, especially in regard to key commitments?

AramZS: I was wondering
… how much more valuable this becomes to your example implementation
… if the third party cookie goes away
… right now proof of value is lower
… as you consider value, how does that factor in
… and also, Steven talked about hopeful future
… of adoption by other browsers than Chrome
… wonder if proof of value goes up in that case as well

Jaime: Good questions
… that 3rd party cookies are still there is an advantage...we can validate
… while we have 3rd party cookies still available to track user properly
… it all comes down...Trust Tokens are to store a signal
… what I understood, it's up to the users or developers to answer question of what that trust signal means
… user is trusted, whatever Yahoo decides
… whatever question the developer wants to answer; what is actually Trust
… how you make determination of what to store and then you store it there
… We are trying to understand what we store there
… if user is trusted or human
… the multiple sites that Yahoo operates
… we may not gain much value
… signal might be similar among all four of our sites
… but if signal comes from a separate site
… and if AOL believes it's trusted
… we might also assume it's trusted; if this can be shared
… other use case might add some value
… might have a publisher site like vacationdeals.net running vacation ads
… then treat as a trusted user
… but what does that trust or human mean?
… what to store?
… that's what we are trying to wrap our heads around

AramZS: were you trying to prove more than 'this is just a human'
… or 'this person is of a group participating in a survey' or a demographic group with some methodology?

Jaime: Right now trust token is constrained by 1 or 0
… try to understand that question we want to answer
… only two issuers
… can only solve a yes/no question
… so have to know what that question should be
… that is something we were not able to wrap our heads around
… and it's not just the demographics part
… not just if human, or if bot; what does that mean
… all on Yahoo to make that determination

AramZS: do you foresee different domains to offer different tokens to answer different yes/no questions?
… a yes/no demographic question for example

Jaime: we thought of having A.Yahoo.com
… each one answers a different question

<npd> good that we're directly testing the privacy risks as well :)

Jaime: we are bounded to our issuers to prevent that kind of fingerprinting
… we cannot answer more than two questions

<npd> it sounds like Yahoo considered creating dozens of issuers, each of which could issue a different bit in the user's ID, but confirmed that they couldn't be separately confirmed by the page in order to re-create the ID across domains

AramZS: not sure I understand; only issue two tokens?

Jaime: Yes, on left side issue Tokens, Google issues
… on right side, AOL redeems token
… AOL is bounded by two issuers
… on step 12
… it can check on at most 2 issuers
… let's say Amazon.com is the second issuer
… if you want to use tokens from another issuer like Google
… you cannot; have to use what you are bounded to

AramZS: very useful to hear your thinking, thank you

BrianMay: thank you for experimenting and presenting your results
… curious if you can define proof of value more directly

Jaime: if we can tie this to a monetary amount
… and then put this site to all the parties we have
… what dollar amount to benefit from implementing this
… and tie to different use cases
… like login
… we did not find much value; only within our sites
… for the apps use case, maybe some value there
… but what will we store in the Trust Token bit
… and how much value moneywise would that generate in revenue to the organization

Wendell: for us it's about the business value
… and make sure the infrastructure we build supports the business and provides value for our company

BMay: could you list the proofs of value you tried to test and give us some thoughts on opinions like fraudulent ads or proof of logins

Wendell: we did this experiment
… and showed this use case because it had the fewest disagreements
… we toyed with a lot of ideas
… Fraud is well understood as a bad thing

<Zakim> npd, you wanted to comment on how to assess proof of value

Wendell: and that's what we settled on to test the infrastructure

BMay: yes, thanks

NickD: I wanted to add and echo thanks for testing out implementations and sharing it
… it's very useful to the community as a whole
… I think I understand the point that we don't understand what the bit means
… how it's used
… when did you issue Trust Tokens
… what assurance did you provide internally
… and respond to the end point with Tokens

Jaime: when do we issue them; you can issue them
… are you doing this for registered, or login, or non-login
… certain criteria
… up to Yahoo to determine when to issue Tokens
… for successful login attempt
… indicates user is legit and not an abuser
… that's one of the things we thought
… and wrap heads around a non-registered user who doesn't have an account with us
… might be a legit user
… another use case

NickD: when you ran the test, did you choose one answer?

Jaime: yes, when user logged in successfully we issued the Trust Token
… in our case it was after login

NickD: that seems sensible, although other orgs could take advantage of that
… seems like a high value moment; and is relatively easy to communicate to another site
… a fairly straight forward description

Jaime: you see the whole population of users
… and Yahoo might cover 20% and @ may cover another percentage
… another provider
… only cover this amount
… of users

Wendy: I queued up to note that this is great input to the conversation of standards
… if this is a proposal that advances to standards
… helps with availability
… protocol cadence comes into the standards conversation
… thank you for helping us to test proposals

Per (Google): to Yahoo team, great work
… I have one question
… you say one option is you issue a Token when someone has logged into Yahoo
… is that info you are willing to share publicly?
… then other entities can request a token if they get a token and can recognize as a Yahoo user

Jaime: we can discuss
… then we can also talk about ensuring that token is shared with Google or with whatever company

Per: hypothetically
… do you think Yahoo and other companies would share publicly or only with certain business partnerships?

Jaime: we would have to discuss with our project managers
… we are open to look into
… but no determination yet

Wendy: I will go to queue first with hands up

<AramZS> Noting that 'payment processed' is also a usefully strong signal, but I do think the recreation of the reCaptcha flow post 3p seems to be the strongest use case.

Vinod: I work at Amazon on supply side
… we are interested in trust tokens as well
… from perspective of making sure we have human signals available
… whether data mining; get from users or bots
… because of limitations of keys that can be used
… max of two questions you can reasonably answer
… can we use more keys on the issuer side?
… and increase number of questions to answer

<AramZS> But it does open up a whole question of an interesting low cost business of 'humanity authorizers' and how that might impact the ecosystem

<bmay> +1 Aram

Vinod: I know spec talks about not using more than 3 keys
… and publicizing keys to make sure fingerprinting is not abused
… is that where the standard can look to allow more keys
… to answer more questions? What are your thoughts?

Jaime: question for me or Google?

Jaime: like you said the protocol we experimented with addressed one question, answer yes/no
… two keys to sign the Trust token
… more questions would add value
… five, ten, 100
… where do you draw the line?

Vinod: Anyone from Google care to comment?

Steven: choice of three keys
… metadata 3x2
… seemed like enough different values for hi, low, medium levels of trust
… open question
… but don't want to make it much bigger
… need to balance between the privacy aspect and the value-add of having more keys
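
Reading the "3x2" remark as three keys times two private metadata values (an interpretation, not something spelled out in the minutes), the trade-off between expressiveness and fingerprinting surface can be put in numbers with a short Python sketch. The function names are hypothetical.

```python
import math


def distinguishable_values(keys: int, metadata_values: int = 2) -> int:
    """Number of distinct signals one issuer can encode in a token."""
    return keys * metadata_values


def information_bits(values: int) -> float:
    """Upper bound, in bits, on what one token can reveal."""
    return math.log2(values)
```

Under that reading, `distinguishable_values(3)` is 6, or roughly 2.58 bits per token, and each additional key raises that figure, which is the privacy cost being weighed against the value of more keys.
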

Vinod: so what are the guardrails; when going overboard; what is the calibration
… that is a key question

StevenV: that is an open question
… part of standardization path is to find a correct metric

Wendy: that discussion would be here and in privacy pass over in IETF

Mariana Raykova: I designed crypto for tokens
… curious how you decided the private metadata
… and did you run into situation where you changed the private bit
… how did you evaluate the value of these tokens; observe users who were malicious

<npd> I don't know an exact answer for that number, but I would hope that it would be driven by trying to support the use case of 'is this a user that is trusted' vs. 'can we just duplicate a cookie id'

Jaime: what we did not do
… whole life cycle of user from trust...was either 1 or 0
… might have expiration dates; we did not do that
… thinking was maybe obsolete the old trust token
… or generate enough trust tokens
… like 100 TT
… and that value is not going to change
… maybe up to a one @ value of change
… we did not explore

Mariana: for future iterations, should we emulate realistic behavior
… where you are trying to flag malicious users
… and if they have guessed what happens
… the whole power of the metadata is to flag
… malicious users without flagging to the whole world
… wondering how to capture that

Jaime: We are looking into this
… if user is a good user
… then that is handled right there
… something we want to explore, but did not do as part of this experiment

Wendy: Thank you
… we are at the end of our time
… John and Steven had other comments
… I invite you to continue the discussion on Github
… is there a specific place where you would like specific feedback or questions?

Jaime: I will get back to you; talk offline about where to comment with feedback

<wseltzer> https://github.com/WICG/trust-token-api/blob/main/README.md

Wendy: and if input on Trust Token API; sharing pointer to WICG

Jaime: or direct to Wendell
… thank you for inviting us

<svaldez> There's a hope that this gets adopted by the Anti-Fraud CG, so having this presentation there would be useful.

[adjourned]

Wendy: Thanks, Karen for scribing

Wendy: see you in two weeks

– DRAFT –
07 June 2022
