Web and Digital Marketing Convergence

17 Sep 2015


See also: IRC log


Chad_Hage, Reza_Jalili
marktorrance_, keiji, oyiptong, tmichalareas, wseltzer


<wseltzer> Topic: Introduction and Welcomes


<betehess> wseltzer: we need to keep track of what people say

<betehess> ... to make the final report

wseltzer: says to capture minutes in this format (speaker: content) for assembling final report later.

wseltzer: does the web serve your needs as an advertiser/publisher?

wseltzer: on keyboard attached devices, on mobiles, TVs... massive range of devices

wseltzer: the open web is good at this, how can it be better to help communicate with a broad range of devices. questions of analytics, data collection, etc

wseltzer: things that we don't think of as web, but we could be more interoperable that way. we have data formats to help (e.g. HTML, CSS, etc.). what else can we do?

wseltzer: could W3C or others enhance standards like HTML, CSS, etc. to serve this community (marketing) better?
... security: of the web application, user's information, and also of the trustworthiness of the ad delivery and measurement
... "can we avoid the creepy?"
... Can we avoid: 'why is this product following me around the web?'
... how do we enable choice, let people feel more comfortable with the platform
... transparency; how can we be clear with consumers + others about how we are using data so everyone can get comfortable about what is going on
... W3C (World Wide Web Consortium) was launched in 1994 by Tim Berners-Lee

<tmichalareas> wendy: introduction to W3C workshop and the agenda

<brad_at_Trunica> wseltzer: how are consumers represented in the W3C, who is the advocate for the consumer?

wseltzer: Open Web Platform is built around HTML5, plus rich array of applications and interfaces
... APIs, rich media, cross-device, communications, society -- may already be a group within w3c thinking about those challenges
... Program of this workshop: surface challenges
... what's missing. We are not a coercive body, we can't make laws and force people -- instead we find people and help them work together to make things better, by making recommendations which are often adopted based on the value of interoperability
... we can't force uniformity - we take the good ideas, bring standards in and recommend them
... royalty free patent policy w.r.t. tech contributed to by members of w3c working groups
... core infra tech is free to use and build on top of
... horizontal reviews: reviews for accessibility, internationalization, languages + font-types, privacy + security
... Other ways we use (beyond workshops) to gather information that might be precursors to new standards proposals: community groups, interest groups, use cases, requirements, draft input documents

wseltzer: Interest Groups are the place where members can come together to figure out what we need and how we get there from a web standpoint

wseltzer: working groups go through many drafts to lead to a recommended standard -- requires 2 working interoperable implementations
... Questions to answer: what's missing? who is interested to fix it? (identify a common problem, but who are the right people to fix it)
... Whose cooperation and implementation is needed to make it useful - we don't want to start something up in a silo that depends on a bunch of outside parties only to get to the end and find those parties have no interest in what we're doing.
... e.g. make sure we are getting interest + buy-in + commitment from publishers, or browser vendors, or measurement/analytics companies

wseltzer: what's missing from the web? who's interested in fixing it? Whose cooperation and implementation are needed to make it useful? Do we need new work? Not everything must come out of the W3C? Is there a draft for us to start from? Who's interested in writing or reviewing?

wseltzer: There are also other places where great work of this form is happening -- in industry, in IAB, etc. If W3C is the best place to do the work, then great; let's figure out which of our forums will work for that.
... Session 1: setting the stage
... 2 academic flavored presentations from Keio University and University of South Florida

ktakeda: Keiji Takeda presentation on Digital Marketing AntiPattern

ktakeda: work with MIT on security and privacy

wseltzer: Session 1: Setting the stage: Academic perspective from Keio University followed by a perspective from the University of South Florida

Session 1: Setting the Stage

ktakeda: suggestions for antipatterns for common failures in industry related to privacy and security

ktakeda: objective is to define good practices related to privacy and security for web & digital marketing

ktakeda: malvertising is one of the largest issues in advertising

ktakeda: malicious advertising is the biggest problem. as advertising becomes very efficient, it attracts malicious advertisers

ktakeda: the platform of choice to distribute malware

ktakeda: No clear solution to fight malvertising

ktakeda: the malware sites look just like real web pages

ktakeda: ad networks used to show advertisements. using URL shortening tools, the users land on pages with 0-day exploits

ktakeda: 2nd large antipattern is unchangeable persistent ID

ktakeda: the malvertisements include links with several steps, shortened with URL shortening services, that once followed install some software in the user's browser that includes 0-day exploits (virus)

keiji: malicious software is being spread through digital advertising. Perpetrators are using programmatic buying methods to place the malicious software on the web, posing as digital ads

ktakeda: users can't control being tracked or not with persistent IDs

ktakeda: 3rd large problem is user data inspection (too much access to users' data)

ktakeda: and this user data inspection is without consent

ktakeda: DPI - Deep Packet Inspection providers are not successful

ktakeda: an example is the use of users' phone contact list without their consent

ktakeda: 4th large antipattern is accidental data exposure

ktakeda: Many companies make the mistake of placing business and critical data on the same server/environment as the front end web servers

ktakeda: common problem: people have a large to: or cc: list when sending email

ktakeda: this opens the backend data and makes it vulnerable to Google hacks

ktakeda: 5th largest problem: local optimum. focusing too narrowly on local markets. hard to generalize

ktakeda: Local Optimum should be avoided

<brad_at_Trunica> ktakeda: yahoo is surprisingly strong in Japan for search at over 30% usage

ktakeda: twitter is popular in Japan because more content-per-character can be expressed in a single tweet

ktakeda: Tumblr is surprisingly strong in social media in China: 55+% market share

ktakeda: standardization would help address these antipatterns

balajir: Does Television Viewership Predict Presidential Election Outcomes?

balaji: I have long done research on learning patterns from data, with applications to online marketing: clickstream data, recommender systems

balaji: This research started with collaboration with Nielsen here in Tampa. We were looking to get access to data Nielsen has

balaji: TV watch data, and also what people buy. Partnership: what can we do?

balaji: It's November 5, 2012. The world is awaiting news on the next US president. Who will it be?

balaji: What if we had data on who watched what TV shows in the preceding weeks Oct 1 - Nov 5. Can we predict the outcome?

balaji: think data first: opportunistic question.

balaji: Working with Nielsen, pulled together data on 547 TV programs, 165 populated counties, 49 states

balaji: Balaji Padmanabhan is his name; I will use balaji:

marktorrance: reasonable name :-)

balaji: Took a year to analyze data, transformed to 2 variables per show. Minutes per voter, and % of fans.

balaji: 49 or 165 rows depending on state/county, 547 shows. this is called fat data because of high dimensionality

balaji: very painful to ensure data is collected correctly/accurately

balaji: high level findings: was able to rank the programs based on "signal strength"

balaji: able to rank 547 programs based on their signal strength in predicting outcomes.

balaji: very concerned about overfitting (machine learning/statistics jargon, where predictions are only applicable to training data set)

balaji: Based on a single show alone, achieved 82% accuracy at the state level and 75% accuracy at the county level

balaji: data have been validated with Facebook data science report

balaji: The night before the election, the strongest state-based model would have predicted 8 out of 10 "swing states" accurately

balaji: built 547 models instead of one big model

balaji: Most predictive show: "The Daily Show with Jon Stewart"

balaji: If minutes per voter is low (<9.63), then it predicts Republican 18 of 21

balaji: If it is high, then one more split on percentage of fans: if over 2.57%, then it predicts Democrat, otherwise Republican
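
Illustration (not from the talk): the two-split rule minuted above can be written as a tiny classifier. The thresholds 9.63 and 2.57 are the ones recorded in the minutes; the function name, argument names, and the assumption that the percentage of fans is passed as a number in percent are hypothetical.

```python
def predict_party(minutes_per_voter: float, pct_fans: float) -> str:
    """Two-split rule for a single show, as recorded in the minutes.

    minutes_per_voter: viewing minutes per voter for the show.
    pct_fans: share of fans in percent (e.g. 2.57 means 2.57%); assumed unit.
    """
    # First split: low viewing predicts Republican.
    if minutes_per_voter < 9.63:
        return "Republican"
    # Second split, only reached when viewing is high.
    return "Democrat" if pct_fans > 2.57 else "Republican"
```

For example, a state with 12 minutes per voter and 3% fans would be classified as Democrat under this rule, while the same viewing time with 1% fans would be classified as Republican.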

balaji: describes a predictive model that will work before the election based on swing states

balaji: built the model on "safe states", and then used those to predict swing states. Got 8 out of 10 correct.

balaji: Second show is Duck Dynasty -- predicts Republican voters

balaji: Duck Dynasty > 21 minutes = Republican

balaji: because of few rows but thousands of columns, solved by building many, simpler models

balaji: "If you beat the data hard enough, it will confess to anything"

balaji: problem: by chance alone, you can find some models that are randomly going to do very well

balaji: Randomized outcome and built models to test whether model building method is leading to false conclusions

balaji: by chance alone, how many models would you get that are accurate? This is how to tell whether your model accuracy (results) is trustworthy.
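
Illustration (not from the talk): a minimal sketch of the randomization check described above. It fits one simple model per show, counts how many reach an accuracy cutoff, then repeats the count with the outcomes shuffled to see how many "accurate" models arise by chance alone. The one-threshold rule is a stand-in for the classification trees mentioned in the Q&A, and all names, the cutoff, and the data layout are assumptions.

```python
import random

def stump_accuracy(values, labels):
    """Best training accuracy of a one-threshold rule on a single feature."""
    best = 0.0
    for threshold in values:
        hits = sum((v >= threshold) == y for v, y in zip(values, labels))
        # The rule may point either way (high -> True or high -> False).
        accuracy = max(hits, len(labels) - hits) / len(labels)
        best = max(best, accuracy)
    return best

def count_accurate_models(features, labels, cutoff):
    """Number of single-feature models whose accuracy reaches the cutoff."""
    return sum(stump_accuracy(column, labels) >= cutoff for column in features)

def chance_distribution(features, labels, cutoff, rounds=200, seed=0):
    """Counts of 'accurate' models obtained when outcomes are shuffled."""
    rng = random.Random(seed)
    counts = []
    for _ in range(rounds):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # break any real link between shows and outcome
        counts.append(count_accurate_models(features, shuffled, cutoff))
    return counts
```

If the count on the real outcomes is well above what the shuffled rounds typically produce, the accurate models are less likely to be an artifact of trying many models on few rows.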

balaji: Redid this analysis at the DMA level. New great show "Fox & Friends"

(DMA: Designated market area)

balaji: Fox & Friends got almost all of the close DMAs perfectly, with 1 mistake

balaji: election ad spending is an interesting case for multi-platform targeting and digital/web marketing convergence

balaji: can we build something that is cross platform and makes it easy for users to opt-in?

balaji: geo-targeting. we need to know somebody's homebase

balaji: geo targeting -- we need to know where someone's home base is where they are voting.

balaji: but want to advertise to those people wherever they are.

balaji: geo-history?

balaji: location identification needs to be precise at the state, county and DMA level for instance.

balaji: Personalized + context-sensitive advertising

balaji: marketers care about the home base of a user, not necessarily their exact current location

balaji: want to advertise cross-device. tv to mobile

balaji: suppose you can predict they are likely to be democrat or republican based on what they watched on TV -- can we then reach them in a personalized way on other devices?

balaji: but want to do this in a way that is privacy friendly and puts the user in control

Q: could you improve the model by focusing on certain counties that are more important to the state level political outcomes?

A: we are looking at things like that now

Q: what modeling technique? A: Classification trees

Q: Have you made any comparison with Twitter data? And on the classification schemes, have you got any metrics which identify how much each variable contributes to the effect of the prediction?

A: Different models have their own scores for variable significance. In this case we didn't do it because we had only 2 variables.

balaji: A: first question on Twitter -- one extension we're doing right now is pulling in Twitter data for 2012 season. Personally I think it is useful, but it is one of the sources that has intent + manipulation online. It concerns me that there are a lot of intentional pings -- some noise in addition to the signal

balaji: e.g. robots promoting shows

wseltzer: Keiji, do you think this kind of political prediction is a pattern, or an anti-pattern?

keiji: can't really say

Satyam / Nielsen: Q: What about local optimization? A: If we optimize too much locally, we get bound to that specific environment. e.g. in Japan, cell phone companies avoided using cookies on the first-generation mobile web, so they used caller ID instead -- sent a single unique unchangeable ID

keiji: by using that tech, the service providers got used to using that unique ID, so they are not ready to switch to more volatile IDs like cookie IDs


chage: introduction to session2 from Chad Hage

chage: 3 presentations on Metrics and Data Collection

Session 2, Metrics and Data Collection

presenting, Jarrett Wold, Ad-ID

jwold: Ad-ID is a unique identifier, like a product code for advertising assets
... unique identifier leads to interop, reduce human error

AdID is a unique identifier for advertising assets, such as the creative

jwold: AdID is at the center of the transition to a digital ad slate, e.g. for a TV commercial.

jwold: the ad slate would include metadata about the asset, like what iTunes does

jwold: embedded into files

jwold: Developed XMP Ad-ID schema with IAB and Adobe

jwold: Working with SMPTE on the explanation of what the schema represents, as a standard

jwold: Working with IAB on VAST 4.0

Ad-ID slides: https://www.w3.org/2015/digital-marketing-workshop/slides/Ad-ID-W3C.pptx

jwold: AdID getting good adoption on both broadcast TV and online video. Trying to push adoption in other areas like audio and internet display

jwold: Media Interoperability: register -> operationalize -> measure + report

jwold: All commercials produced for TV, radio + digital platforms that include SAG-AFTRA union members, must use Ad-ID

interesting strategy for pushing adoption of an up and coming standard

jwold: SMPTE / CIMM OpenID - detect identifier of what is airing, and then do with it what you like

jwold: uses watermarking, but identifier gets lost during compression. Working with SMPTE / CIMM to try to get the ID to survive compression

REZA: Browser Aware Data Collection

reza: Reza Jalili, I work at Adobe

Reza's slides: https://www.w3.org/2015/digital-marketing-workshop/slides/Adobe_w3c-browser-aware-data-collection.pdf

reza: collection done with in-page JS includes

reza: users have no control, browsers don't know

reza: problems: each library has its own name, semantics are different, endpoints are all different

reza: customer problem: data collection companies have trouble aggregating this data due to data being non-interoperable

reza: proposal: find out what is being collected, understand legal entities involved and privacy rules in place

reza: give control to the user

Chad Hage, Nielsen: https://www.w3.org/2015/digital-marketing-workshop/slides/measurable_by_design.pptx

chage: more and more programmatic ad delivery techniques are emerging, because it's not being done in a consistent way

chage: How do I measure reach, since things are so tailored/personalized?

chage: how do we identify non-human traffic?

chage: Tomorrow is fast approaching. Non-human traffic will be an even bigger problem in the future, when we have so many more connected channels

chage: 10 billion devices on the web by 2020

chage: Need for measurable, 3rd party independent, reliable + consistent by design

chage: working groups that exist today, or that could be formed out of this, could get us to reliability by design

chage: Proposal: simplify the delivery of ads into content by extending HTML spec to include document elements that make it simpler on clients such as browsers to detect, identify, acquire, and render an ad

chage: Ensure that these specs address both human and legitimate non-human traffic

Q: Satya from Nielsen Catalina: I've been working with AdID for mobile. You are saying you have an ID to uniquely identify the creatives. Do you also distinguish whether it was designed to be shown on a particular type of device? And where was it shown?

satya: is the metadata extendable?

jwold: A: In our UI, it's up to the advertiser to tag the asset e.g. what type of media it is (tv billboard). We are just a registration authority.

jwold: we do not track where the ad gets displayed

jwold: extending the metadata: we have an intensive XML with many more fields than I showed on the digital slate. e.g. an ad trafficker can put in the ad start + end dates, who the talent is for that ad; something like 128 fields.

a: for the digital slate, we extended our XML for that particular specification. That's now an industry standard.

iab: Q for Reza: if we take the data away from people by putting more controls on it instead of the "tag based free-for-all" buffet we have today, how do we make that acceptable to the industry?

A (reza): Great -- that worked in the colonial town, but now as we want the industry to grow, we can improve quality of the data by adding controls e.g. a port keeps track of who is coming in and out, and standard ways of tracking the data, wouldn't that be interesting?

Q: Chad raised the issue of personalization. How will AdID respond to personalization issue?

Q: we are getting to the point where in principle every ad could be individualized.

A: Right now we don't do anything about personalization, the info is all about that particular asset

Q: In the interactive world, there could be multiple "assets" coming together for a particular interaction

A: If you have 5 jpg files or png files, we are tracking those particular assets, and it starts from the creative side.

reza: Steve's question touches on Chad's issue -- when every ad is just for you and the audience is audience-of-one, what are you measuring anyway?

Q: (from Mozilla) I'm working on some custom elements that report things, and it's early, but it seems to solve the problems you want solved; question is how this would get adopted by the ad industry

display ads have momentum -- even if the new tech existed, how does that get to market?

A: (chage): if we put out something that's better, I think adoption will be there. This ecosystem has not shown that it's rigid -- it is in fact overly dynamic

andrea: Q: Thinking about mobile apps, when you install on Android, it asks what data it can use. In browsers they ask if we want to use geolocation. Couldn't we do the same for certain other information e.g. name, email, etc.? They can have default settings for all sites, and can make exceptions for certain sites - it would only ask for the info it needs when you arrive at a particular page that needs certain data

chage: how will the experience work when we're on a big screen or some other situation? It is no longer just a browser

andrea: What is the benefit of using adid for web?

jwold: take an ad that starts at the creative, it gets shipped off for delivery, throughout the supply chain. With AdID you create the metadata within the system and get a unique ID -- that will live throughout the ecosystem. Eliminating human intervention and rekeying of the metadata.

satyam: in last year, there has been 5-fold increase in mobile consumption of TV content.

satyam: so I want to keep it in sync, so I can measure the ROI across all devices

jwold: we have a plugin for Adobe Bridge that lets you plug the Ad-ID into an asset. Shows the power of using our APIs.

brad: I like the idea of making reporting and measurement more of a first-class platform citizen. I have similar things in my talk. The hard part is the economics. Want to put users in control, but if the first thing they do is turn everything off including telemetry, Google Analytics, etc., then people will go back to their old way of doing things. Interesting to explore economic models so that users will feel it is worthwhile to be included in the exchange of value in the platform.

reza: I agree. How far on that continuum do we want to go? We may want to get to the interoperability, and it could just work in the browser, but there may not be a switch for the user to control it.

brad: the user-agent is the user's agent. Let's start from that as a first principle.

Kaushik Dutta: Q from U So Korea faculty, but building mobile ad tech platform out of Singapore. A few points:

Didn't hear the term fraud so far, but need to be concerned about that. Have seen numbers of 50-98% fraudulent clicks. I've seen it happening on the click, where you are going to places where agents are devices

On data collection, attribution; need to think about this

After the advertisement, how do we collect the data so we can show value to campaign managers + brands

Other point is where the ad was delivered *is* important, because of attention span, size

Need to think of the advertisement as a program, interacting with another platform over HTTP protocol. How do we deliver the ad, measure it, avoid fraud, etc. -- service oriented architecture in enterprise platform is good model for this we can learn from.

Mark_Torrance: CTO at Rocket Fuel
... the more metadata the better, the more descriptive the better
... I'd love if ads had data about people, color, length, relation info
... e.g., this is the 15 second version, this is the longer version
... standard taxonomies
... re Standardization, worried that the more commonality we introduce
... the more the ads look the same, will users block it all?
... I'd like to take the position that the user agent is the user's agent, but history of marketing is about surprising people
... we don't have standards for billboard shape
... tension between user experience goals and goals of marketers

Brad: user agent as user's agent was a statement of fact



Have you guys been thinking about annotating ads with the purpose of the campaign or what is the purpose the data is used for?

What about models where the behavioral data never leaves the browser, but still some targeting + measurement can be applied?

jwold: we've thought about that, for that metadata and what we are specializing in, it's a fine line in terms of what makes sense for that creative asset.

chage: there's a layer of data that gets added on top that explains what the purpose/intent of the advertising is. Today, we're looking at it like "this is the introduction of the brand" -- we infer a lot of things that we can deduce, but would love that to come into the actual stream and have it be more factual rather than deduction

david: Q: Please explain more about opportunities HTML enhancements can have for the browser, e.g. AdFrame?

chage: I'd love to look at any frame that is actually consistent and reliable + repeatable -- could produce measurement off of it

... the way the existing tech stack is being used is so different -- even the same user does not get the same iframe every time. I want the consistency that specs have brought to other parts of the HTML paradigm.

ash from whiteops: Before we solve viewability, we have to be sure it is a human at the other end of the wire.

... there are lots of other efforts; organizations like TAG are barreling forward aggressively with standards they are pushing. TAG is about to charge everyone $10k to identify themselves, and it's not a great standard.

... any bad guy who wants to pay $10k is now in the whitelist

<BrendanIAB> Point of correction on the TAG registration:

greg: one thing I heard is that if users could turn off the ads they would. Companies could have followed Do Not Track. Do advertisers actually want to give users choices? How can we create a situation -- if you are not viewing the ads we're serving then you won't get to see the rest of the site.

<BrendanIAB> 10k is the proposed price for validated registration - companies that are checked to be sure that they exist.

reza: there's policy and philosophy. We want to have a standard way for companies that want to be compliant so they can do that. We hope to come up with standards that allow flexible implementation on top of that.

<BrendanIAB> The working group (public for anyone to join) is considering a very low fee for non-validated registration,

<BrendanIAB> And validation is independent of certification (knowing whether a company is adhering to guidance on anti-fraud, anti-malware, anti-piracy efforts)

Brad_Lawrence: question for everyone here. How many people run ad blocking software on their own browser?

a: about 15%

brad: I run it to avoid 0-day exploits, and data consumption on the mobile space. That's what hurts the consumer. There was a small group of people very motivated by philosophical reasons for DNT,

... when I look at my extended family who are tech savvy, they are not mostly wanting to DNT

... tech people are telling our friends to use ad blocking, because we want them to avoid malware.

GregN: if users don't have a granular choice, like "I'm willing to accept ads but only if they don't violate my privacy", then they will just go for a coarse choice like ad blocking.

brad: In TV advertising, people worried TiVo would cripple advertising, but the opposite happened because it is mostly a passive medium

olivier (Firefox): I'm glad to hear us talking about this. People don't mind ads, and sometimes find them useful. AdWords is incredibly useful for many people.

... The problem is not that they are ads, the problem is they don't know why they are seeing the ads, sometimes they are obtrusive, they didn't have a say, and don't know why they are being forced to see it.

... the fact we can target + involve the user in communication is useful -- a male user could go to a website that sells clothes and volunteer that they are male, which would help the website tailor the content.

reza: note that adblockers can have the business model of knowing lots of data about users and selling it.

chage: make the experiences delightful and users will want to come back. The economies of marketing are being discussed; what's been left behind before this conference is the technical aspect of it.

Brad Weltman IAB: comments.

... "we always assume the gears will just work" -- the inverse is true in this room. You can't just take a technical solution and layer it on top -- we have to keep consumers and economics in mind too. Consumers want choice, and want to be involved, but they have to be not too annoyed. They won't want choice at every junction.



Session 3: Security and Viewability

cclark: Ash: fraud, Brad: malvertising, Olivier + Brendan: human security, Mark: viewability, one more

Ash Kalb: Fraud

Ash: We are a security company as well as a measurement company.

cclark: introduction to security and viewability session

Ash Kalb slides: http://www.w3.org/2015/digital-marketing-workshop/slides/WO-W3C-ABK-20150917.pptx

Ash: advertising is about buying a slice of human attention

Ash: botnets are resident on computer

Ash: a clone of you. infinite number of scams

Ash: Botnets have been used for fraud; botnets and click fraud are being used to monetize.

Ash: 5b USD annually is attributed to fraud related to advertising

Ash: scam is: paid as a publisher for running ads

Ash: More sophisticated frauds are being introduced that generate traffic.

Ash: ad-fraud is the best way to monetize malware

Ash: This kind of fraud motivates attackers to attack people's computers.

Brad Hill's slides (group deck); http://www.w3.org/2015/digital-marketing-workshop/slides/security-viewability-digital-marketing.pptx

Brad: I am new to this area.

Brad: But it is an interesting area. It has a very complex structure.

Brad: there are problems in the ad industry that could make it unravel

Brad: web based on trust. malvertising is catastrophic for this model

Brad: Malvertising collapses the boundaries between good and bad.

Brad: Malvertising is making ad blocking as essential as antivirus

Brad: publishers have to trust advertisers and ad networks

Brad: they don't really trust them, but most don't have enough power to demand better security

Brad: Advertisers and ad networks don't trust publishers

Brad: we need to improve the platform, less trust, more guarantees

Brad: if you can't sandbox it, you must be able to analyze it, if you can't analyze it, you must be able to sandbox it

Brad: If you can't sandbox it you must analyze it.

Brad: some approaches: ad "stitching". inline ads with publisher content on the server-side

Brad: Ad "Stitching" happens today in many places.

Brad: this is completely nuts from a security standpoint

Brad: when is stitching OK? image/video + text. simple model, no script, no flash, no xhr, no cookies

Brad: It's not sandboxed at all, so you have to analyze it.

Brad: have to trust facebook that the ad has been seen

Brad: another angle is iframes and sandboxing. still had lots of hurdles to make it work with what people have wanted, e.g. working with plugins

Brad: maybe it's time to revisit iframe sandbox

Brad: iframes and sandboxing. Strong isolation; enforcing what content is shown and where links go is still difficult.

Brad: Few have wanted to use it.
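
Illustration (not from the talk): a minimal sketch of what serving an ad in a sandboxed iframe could look like from a publisher's templating code. The `sandbox` attribute and the `allow-scripts` token are part of standard HTML; the function name, the default permission set, and the example URL are hypothetical, and a real deployment would still have to deal with the hurdles mentioned above (plugins, link destinations, measurement).

```python
from html import escape

def sandboxed_ad_iframe(src, width, height, permissions=("allow-scripts",)):
    """Return markup for an ad iframe restricted by the HTML sandbox attribute.

    permissions: sandbox tokens to grant; anything not listed stays blocked.
    """
    tokens = " ".join(permissions)
    return (
        f'<iframe src="{escape(src, quote=True)}" '
        f'width="{width}" height="{height}" '
        f'sandbox="{escape(tokens, quote=True)}"></iframe>'
    )
```

Leaving `allow-same-origin` and `allow-top-navigation` out of the token list is a design choice in this sketch: the framed creative can run script but is treated as a unique origin and cannot navigate the publisher's page.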

Brad: hybrids: analysis + sandboxing

Brad: no standard yet

Brad: propose an approach: ad network hybrids (Analysis + Sandboxing together)

Brad: still hard to do independent measurements

Brad: Can WebAppSec WG help?

Brad: we work on Iron frame

Brad: should independent measurement and audit be a first-class citizen in the web platform?

Brad: scripts in a membrane? like a chrome extension?

Brad: Declarative reporting like CSP.
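
Illustration (not from the talk): for readers unfamiliar with the CSP comparison, this sketch shows the existing declarative mechanism being referred to. A page declares a policy in a response header and the browser reports violations to an endpoint without blocking anything. The header name and the `default-src`, `script-src`, and `report-uri` directives are real CSP features; the function name, origins, and endpoint path are hypothetical, and this is not the measurement proposal itself.

```python
def report_only_policy(allowed_script_origins, report_endpoint):
    """Build a report-only CSP header that reports unexpected script sources."""
    sources = " ".join(["'self'", *allowed_script_origins])
    policy = (
        f"default-src 'self'; "
        f"script-src {sources}; "
        f"report-uri {report_endpoint}"
    )
    # Report-Only means violations are reported to the endpoint, not enforced.
    return {"Content-Security-Policy-Report-Only": policy}
```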

Olivier: Let’s Encrypt

Olivier: https is a good thing to use

Olivier's slides: https://www.w3.org/2015/digital-marketing-workshop/slides/oyiptong_letsencrypt.pdf

Olivier: talks about encryption https

Olivier: Privacy matters

Olivier: firesheep example that harvested cookies

Olivier: Public communication: Firesheep is a tool to capture cookies sent in clear text and to hijack sessions.

Olivier: cases Google, AT&T

Olivier: Verizon: Perma-Cookie, Verizon-ID can link cookies used in past.

Olivier: Xfinity WiFi injects JavaScript into web content viewed by users.

Olivier: in China there was injection of JavaScript into pages served to Baidu users, used for DDoS.

Olivier: In China there was DNS cache poisoning.

Olivier: HTTPS isn't perfect, but it's better than HTTP

Olivier: HTTPS is better than HTTP on encryption, data integrity, and authentication.

Olivier: Mozilla: new features will not be accessible over HTTP, only over HTTPS

Olivier: HTTPS is the way to go forward to avoid security issues

Brendan: Human security, talks on two topics.

Brendan: We need secure communication with HTTPS.

Brendan's slides are also in http://www.w3.org/2015/digital-marketing-workshop/slides/security-viewability-digital-marketing.pptx

Brendan: describing the advertising industry tree

Brendan: The advertising industry has a tree structure.

Brendan: If you talk over a secure channel, you have fewer parties involved.

Brendan: HTTPS is the way to go.

Brendan: Reducing the 3rd parties to use for advertising reduces the opportunities for snooping

Brendan: There was some resistance.

Brendan: IABtech lab is developing an ad tech https implementer's guide

Brendan: Server Side ad insertion, insertion in the middle of communication is bad.

Brendan: audio/video ad insertion makes sense

Brendan: We are developing function to accept ad from different places then integrate them.

Brendan: Building trust with an increased number of 3rd parties is expensive

Brendan: Operators have small number of ad networks to trust.

Brendan: If I profile sites I may be able to make more value on ad.

Brendan: server ad-insertion reduces transparency for the end-users

Brendan: IAB and the Tech Lab work to solve such issues.

marktorrance_: going to talk about viewability

Mark: Rocket Fuel tries to show ads to the right users based on machine learning

marktorrance_: customers look for proxies of what they are looking for

Mark: In direct-response campaigns, marketers can figure out ROI by evaluating sales

marktorrance_: looking to avoid waste: fraud, and ads that haven't been seen

marktorrance_: GroupM has come out with its own standard for video

<BrendanIAB> MRC developed and issued the Viewability standard, IAB was a participant in the development.

marktorrance_: We now need to manage different standards for different ads.

marktorrance_: with all those different standards for viewability, it's important to be able to measure delivery, by publisher and by third-party audits

marktorrance_: there are a lot of trackers, and they all have reasons to be there. they want partners to be audited

marktorrance_: non-viewable impressions are lumped together with botnet fraud

marktorrance_: challenge for Rocket Fuel is that they had to bid to even display the ad

marktorrance_: the ad is sold at the time the page is rendered, not when the page is scrolled down

marktorrance_: asks: change it on the publisher side, to only bid when the ad comes into view

marktorrance_: would decrease the time to load the page. let's not sell the ad if it's not viewable

marktorrance_: hard to know if an ad is viewable because of nested iframes. hard to know where it has been served or the geometry

marktorrance_: ironframe tries to do this maybe

marktorrance_: want the ability to answer: what is the site i'm on, what is the chain of sites from the site to me?

marktorrance_: ancestor origins in Chrome is good

marktorrance_: We have nested iframes. We need a method to detect a bad publisher inside.

marktorrance_: another is the geometry

marktorrance_: We have a way to present a limited range of data that can be shared with the advertiser.
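
As an illustration of the geometry question raised above, here is a minimal sketch that computes how much of an ad's rectangle falls inside the viewport. The `Rect` type, the function names, and the default 0.5 threshold are assumptions made for this example, not anything presented at the workshop. The sketch also assumes trustworthy coordinates are already available, which is exactly the part that nested iframes make hard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in a shared coordinate space (hypothetical type)."""
    left: float
    top: float
    width: float
    height: float


def visible_fraction(ad: Rect, viewport: Rect) -> float:
    """Return the fraction of the ad's area that overlaps the viewport (0.0 to 1.0)."""
    if ad.width <= 0 or ad.height <= 0:
        return 0.0
    # Overlap along each axis is the distance between the innermost edges.
    overlap_w = min(ad.left + ad.width, viewport.left + viewport.width) - max(ad.left, viewport.left)
    overlap_h = min(ad.top + ad.height, viewport.top + viewport.height) - max(ad.top, viewport.top)
    if overlap_w <= 0 or overlap_h <= 0:
        return 0.0
    return (overlap_w * overlap_h) / (ad.width * ad.height)


def is_viewable(ad: Rect, viewport: Rect, threshold: float = 0.5) -> bool:
    """Apply an assumed area threshold; real standards also add a time requirement."""
    return visible_fraction(ad, viewport) >= threshold
```

An ad that straddles the fold is counted only for the part above it, so a bid could in principle be deferred until `is_viewable` becomes true.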

dan: we're not the bad guys

dan: working on ironframe

Dan: We are here in Tampa and not on a yacht; we are not the bad guys.

Dan: I have broken most browsers in various ways.

Dan: Is working to develop ironframe.

dan: separate viewability in two things: natural viewability (below the fold) and absolute viewability

dan: another name for the viewability problem is clickjacking

Dan: The viewability issue is related to another problem, clickjacking.

dan: retweet shows a popup because the only way to ensure authenticity has been a popup

Brendan: Attribution modeling is difficult to do right.

Brendan: To measure ROI we need the right attribution modeling.
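
To make concrete why the choice of attribution model matters, here is a minimal sketch comparing two common rules, last-touch and linear, on the same sequence of ad touchpoints. The function names and channel labels are hypothetical; no speaker described a specific model.

```python
from collections import defaultdict
from typing import Dict, List


def last_touch(touchpoints: List[str], revenue: float) -> Dict[str, float]:
    """Give all credit for a sale to the final touchpoint before purchase."""
    if not touchpoints:
        return {}
    return {touchpoints[-1]: revenue}


def linear(touchpoints: List[str], revenue: float) -> Dict[str, float]:
    """Split credit evenly across every touchpoint; repeated channels accumulate."""
    if not touchpoints:
        return {}
    share = revenue / len(touchpoints)
    credit: Dict[str, float] = defaultdict(float)
    for channel in touchpoints:
        credit[channel] += share
    return dict(credit)
```

The same path yields very different ROI per channel under the two rules, which is one reason the choice of model is contested.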

Ash: For a botnet it is easy to make fake clicks.

???: On HTTPS: I have developed one of the biggest DSPs; as an engineer, using HTTPS for large traffic volumes is a nightmare.

Brendan: It is a challenge now; as browsers improve it will become possible.

Brad: HTTP/2 is great if you use it in a simple model, but it does not work in your model.

oyiptong: https://istlsfastyet.com/

Dan: The huge nightmare is that most people cannot properly configure crypto.

Satya: How do you attribute if a device's user changes?

marktorrance_: matching user-households to panel data ~97% accurate on Rocket Fuel data

chrisclark: how important is functionality of script/display as compared to data collection?

BrendanIAB: it's important for rich media, e.g. video player
... market differentiation, features around rich media interaction
... VPAID, when an ad pops over the video, the video pauses
... so you can order your movie tickets and go back to the trailer

marktorrance_: other than sending metrics back, another reason for scripting is bot determination
... that's not going to be solved by taking away ability to run programs

BradHill: I'm not for taking away JS; what does the right sandbox or isolation mechanism look like?

dankaminsky: Flash actually has a well-developed sandbox

BrendanIAB: IAB guidance on rich media no longer includes Flash

BradHill: mobile, Chrome, FF making it click-to-play, turning it off during the last zero day

dankaminsky: Google is doing dynamic translation from Flash to HTML5

BrendanIAB: Adobe's publishing tools export to HTML5

Andre: Coming from Brazil, most things in US seem 5 years ahead, but not banking
... yet Brazilians are afraid to buy online, too much fraud

Andre: Hackers in Brazil are very creative. People are afraid of using online services.

marktorrance_: lots of drop-off in the measurement in between various parties
... if there were a way everyone could trust one company's measurement, big step forward

marktorrance_: 3rd-party measurement is important to monitor how many impressions or clicks are made.

Ash: if everyone can agree on the metric "how many humans viewed the ad"

dankaminsky: every time you have metrics, you have people gaming the metrics
... including whether we're watching
... 3% to 40% difference between when we said we were watching "August" and before/after
... most interesting to me, the server-side stitching attention

bradhill: often the server-side ad-frame, but that's just as bad

dankaminsky: should we do signed content blob from foreign origins
... signed blobs for everything?

Bradhill: subresource integrity, live in Chrome and FF
... from W3C WebAppSec
... specify a hash for a script; throw away scripts that don't match
... you still have to analyze the script at some point
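
The hash check described above can be sketched as follows: compute the value that would go into a script tag's `integrity` attribute, then verify fetched content against it. This is a simplified illustration that handles a single hash with no options; the function names are made up for this example.

```python
import base64
import hashlib
import hmac

SUPPORTED = ("sha256", "sha384", "sha512")


def sri_value(content: bytes, algorithm: str = "sha384") -> str:
    """Build an integrity string of the form '<algorithm>-<base64 digest>'."""
    if algorithm not in SUPPORTED:
        raise ValueError(f"unsupported algorithm: {algorithm}")
    digest = hashlib.new(algorithm, content).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"


def matches(content: bytes, integrity: str) -> bool:
    """Return True only if the content hashes to the expected integrity value."""
    algorithm, _, _expected = integrity.partition("-")
    if algorithm not in SUPPORTED:
        return False
    # Constant-time comparison; a mismatch means the script is thrown away.
    return hmac.compare_digest(sri_value(content, algorithm), integrity)
```

Any change to the script, including a legitimate upgrade by its author, changes the hash and fails the check.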

marktorrance_: Google analytics wouldn't be able to upgrade their code

bradhill: and it doesn't help with the phishing of an otherwise trusted party
... still need layers of isolation to guarantee security invariants

oyiptong: that was my issue with SRI -- I don't see how much it buys

bradhill: it lets you put things on CDNs and verify that you're getting that back
... compromising the jquery CDN doesn't mean the entire internet gets owned

dankaminsky: does SRI stop mixed content warnings?

bradhill: no
... we have only one bit of info conveying privacy and security, can't yet decompose that

Dutta_Kaushik: old devices?

dankaminsky: Performance of lots of ad code is poor
... we try testing on racks of devices
... this problem needs to be tested on real hardware

chadhage: how do we differentiate between good bots and bad bots

dankaminsky: IAB bots and spiders lists
... lists good bots

BrendanIAB: that list does not address fraudulent bots (unless they're really dumb)

chadhage: that's a big problem for the passive measurement

marktorrance_: HTTPS vs captive portals?

Data Modeling and Context

satya: data modeling, reaching audiences
... I'm with Nielsen Catalina Solutions, a joint project between Nielsen and Catalina (coupons)
... what you buy and what you watch
... I'm going to pose a question, how to do attribution properly

Alexandre: Structured Data for Marketing

Alexandre: My background is data on the web. Have worked for W3C.

betehess's slides: http://www.w3.org/2015/digital-marketing-workshop/slides/structured-data-for-marketing.pdf

betehess: schemas for producing data -> your website -> data consumers
... social sites (pinterest, twitter), search, other sites, local search, gmail
... there are so many formats and schemas
... schema.org, Open Graph, Twitter, Pinterest says "we support schema.org" but doesn't say which syntax
... you want to make sure it works
... but the markup is very easy to break
... webdevs don't know what they're manipulating when it comes to data
... overlaps among ontologies

betehess: There are so many overlaps among ontologies

betehess: so I have to define the name 3 times to make Google, Facebook, etc. read it right
... Support.
... it's hard to discover who supports what; hard to scale
... different subsets of what's supported from schema.org
... I want caniuse.com for data schemes
... and structured data: all the parts of the ontology, known readers of the data, what's supported
... I think W3C is the right place to STANDARDIZE ALL THE THINGS
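
To illustrate the point about defining the same name three times, here is a minimal sketch that renders one product record as schema.org JSON-LD, Open Graph meta tags, and Twitter Card meta tags. The `render_markup` helper and the sample record are hypothetical. Which consumer actually reads which syntax is the open question raised above, so the choice of formats is only an assumption.

```python
import html
import json
from typing import Dict


def render_markup(item: Dict[str, str]) -> str:
    """Render one record in three overlapping vocabularies (illustrative only)."""
    name = item["name"]
    description = item["description"]

    # schema.org as JSON-LD; escape "</" so the payload cannot close the script tag.
    json_ld = json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "Product",
            "name": name,
            "description": description,
        },
        indent=2,
    ).replace("</", "<\\/")

    def meta(attr: str, key: str, value: str) -> str:
        return f'<meta {attr}="{key}" content="{html.escape(value, quote=True)}">'

    lines = [
        '<script type="application/ld+json">',
        json_ld,
        "</script>",
        meta("property", "og:title", name),  # Open Graph uses property=
        meta("property", "og:description", description),
        meta("name", "twitter:card", "summary"),  # Twitter Cards use name=
        meta("name", "twitter:title", name),
        meta("name", "twitter:description", description),
    ]
    return "\n".join(lines)
```

Generating all three from a single record at least keeps the values in sync, though it does nothing about discovering who supports what.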

Eric_Kauz: GS1

Eric's slides: http://www.w3.org/2015/digital-marketing-workshop/slides/GS1_Context_Panel.ppt

scribe: We're working with W3C in a few areas
... We're best known for bar codes
... product ID, coupon ID, party ID

Eric: GS1 is a standardization organization for global supply and demand chains.

scribe: identify, capture, share
... relevant areas: id standard for digital coupon management; GTIN product identifier
... digital coupon standards, product data model
... My background is as data architect

Eric: We work on an identification standard for digital coupons, product ID GTIN +s, a digital coupon standard, a product data model, and a web vocabulary (schema.org).
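
For readers unfamiliar with the GTIN, here is a minimal sketch of its mod-10 check digit: weights of 3 and 1 alternate, starting with 3 at the digit immediately left of the check digit. The function names are made up for this example, and the sample codes in the usage note are commonly cited examples, not taken from the talk.

```python
def gtin_check_digit(payload: str) -> int:
    """Compute the check digit for a GTIN payload (all digits except the last)."""
    if not (payload.isascii() and payload.isdigit()):
        raise ValueError("payload must contain only ASCII digits")
    total = 0
    # Walk right to left: the rightmost payload digit gets weight 3, then 1, 3, ...
    for position, char in enumerate(reversed(payload)):
        weight = 3 if position % 2 == 0 else 1
        total += int(char) * weight
    return (10 - total % 10) % 10


def is_valid_gtin(code: str) -> bool:
    """Check length (GTIN-8/12/13/14) and the trailing check digit."""
    if not (code.isascii() and code.isdigit()) or len(code) not in (8, 12, 13, 14):
        return False
    return gtin_check_digit(code[:-1]) == int(code[-1])
```

For example, `is_valid_gtin("4006381333931")` and `is_valid_gtin("036000291452")` both hold, while changing the last digit of either code fails the check.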

scribe: semantic markup of digital coupons; targeted promos based on digital receipts
... interop between paper and digital coupons
... Web vocabulary to extend schema.org

Eric: vocabulary includes food and beverage product information

scribe: about 300 new attributes for schema.org products

Eric: more work planned to include properties from Digital Coupon Management Standard into GS1 web vocab

Eric: Our community is mostly from product vendors.

Sungkwan Jung slides: http://www.w3.org/2015/digital-marketing-workshop/slides/PositionPaper_DigitalSignage.pdf

Sungkwan: talks on information meta-data for digital signage.

Sungkwan: KAIST is an educational institution; we are now developing a data model for digital signage as a government-funded project.

Sungkwan: Dart Media is a system for digital signage.

Sungkwan: The ad delivery system provides bidding and auction functions for advertisements. Digital signage devices have multiple sensors.

Sungkwan: sensors include camera, proximity, CO2, temperature, humidity, etc.

Satya: title: using purchase data to inform digital advertising

Satya: We are working to include user behavior on top of demographic analysis.

Satya: demographics-based marketing misses sales opportunities.

Satya: what's the use of advertising baby products to someone without kids? TV was broadcast, but mobile device marketing can be better targeted

Satya: Intersection of Brand Volume and Demo Target is only 47%.

Satya: How to link what consumers buy to what they watch.

Satya: We merge watch data and actual buying data. We have a single-source digital HH.

Satya: Catalina Marketing provides data.

Satya: 8.7 billion impressions of video content in non-linear video, a 5-fold increase.

Satya: As we enter the post-cookie world, people moving from desktop to mobile, from browser to apps

Satya: How to track without invading people's privacy is a key question for this group.

Satya: everyone has their own ways of tracking

Satya: We are entering the post-cookie world.

Satya: Who should get the credit?

Satya: Mobile Ad IDs (MAID) change with the device, every 18 months or so

???: How do you link those different data sources?

Satya: We have a third party match the data, so we can track users without having actual personal data.

Satya: We have Yahoo IDs linked to actual consumers without having actual personal data.

Marktorrance: IAB tried to do audience data standard, but it was too big an area, not enough focus
... then narrower focus on ID syncing, but there was already an installed base

???: How about using a hash of the e-mail address as an identifier?

marktorrance: the user needs to be able to clear the identifier,

aaa: Consumers should have a chance to opt out.

marktorrance: like clearing cookies, minimum bar for consumer choice
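
A minimal sketch of the hashed e-mail idea and of why clearing is the sticking point: a plain hash of the address is stable forever, whereas keying the hash with a per-user value that can be regenerated gives the consumer something to reset. Both function names and the reset-key scheme are assumptions for illustration; no participant proposed this exact design.

```python
import hashlib
import hmac
import secrets


def email_hash(email: str) -> str:
    """Stable identifier: SHA-256 of the trimmed, lowercased address."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def resettable_id(email: str, reset_key: bytes) -> str:
    """Keyed identifier: rotating reset_key yields a new, unlinkable value."""
    normalized = email.strip().lower()
    return hmac.new(reset_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def new_reset_key() -> bytes:
    """Generate a fresh random key, analogous to clearing cookies."""
    return secrets.token_bytes(32)
```

With `email_hash` alone, the consumer has no equivalent of clearing cookies short of changing e-mail address.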

Andre: On schemas: XML was expected to become effective. How do you think it is going to be in 5 years?

betehess: we will have more data, and we hope it becomes simpler, but there is no good way at the moment.

Satya: Q to Sungkwan. Is it possible to present coupons on digital signage and link them to user devices?

Sungkwan: We have run an experiment that provides coupons based on sensor data.

sungkwan: We also made an API to provide sensor data to the web browser.

betehess: the ID is not yet a first-class web citizen, you can't link to it
... I'd like to see more use of the Web architecture

marktorrance: The Minority Report device made me think again of the "creepy" question
... we don't have chief creepiness officer, social scientists to help us think more about consumer attitudes
... also, consumer attitudes change
... would people find it creepy to get a shoe ad on their phones after browsing on the desktop?

BrendanIAB: the literature of the 80s, direct mailing industry, was thinking about these issues
... so I don't believe the Target-knows-you're-pregnant story because that industry knows to add noise

oyiptong: uncanny valley

gregn: If a company wants to walk up to the line of what's legal, it's often violating social norms, even if it's legal
... pushing the bounds of tech
... If I were to follow you around writing down your purchases, you'd find that creepy whether or not legal

BradL: you need to look across demographics too
... and experience, difference between public search and private email

BradH: some companies draw the line internally, e.g. don't use your adult browsing to target ads

BradW: ultimately, the chief creepy officer is the user
... companies learn quickly when they've gone too far

reza: Data modeling is a rich area, we need to go deeper
... IDs are only one piece; how do you connect schemas, semantics?
... catalog what exists, what could be improved, what could be done at W3C?

wseltzer: it sounds as though there's a confluence of shared interest for product data

reza: and continue on to digital signage

sungkwan: digital signage is often in public places, people don't mind sensors

keiji: in Japan, vending machines have video cameras, make recommendations to passers-by based on the demographic

dezell: we tried to do that at gas pumps, but many state laws against profiling

BradL: local company, TruMedia, pretty good at recognizing people 30' out, and changing the ads to match

<Jinhong> TruMedia is 'TruMedia'

Satya's slides, http://www.w3.org/2015/digital-marketing-workshop/slides/NC_W3C_Purchase_Behavior.pptx

Summary of Action Items

[End of minutes]

Minutes formatted by David Booth's scribe.perl version 1.140 (CVS log)
$Date: 2015/10/05 14:49:41 $
