Re: ISSUE-25: text to be discussed in today's call

I plan to ask some questions about this text, time permitting.

First, on the requirement that data "Must be pseudonymized before
statistical analysis begins, such that unique key-coded data are used to
distinguish one individual from another without identifying them".
Questions about this:

(1) What does "identifying" mean in this text? (One might read "without
identifying" as requiring that data be "de-identified" according to the
definition that appears elsewhere in the spec. But if the data qualifies
as de-identified, then no permitted use is needed here, because the general
safe harbor for de-identified data already applies. Alternatively, if
"identifying" means something different here, then that should be spelled
out.)

(2) What does "unique key-coded data" mean? Is the text about "unique
key-coded data ..." meant to serve as a definition of "pseudonymized"? If
so, it seems overly prescriptive, since it would define the term by one
particular method that (purportedly) achieves pseudonymization.
Alternatively, this text might be read as requiring a particular
(purported) pseudonymization method. If so, why require this particular
method?
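
To make the question concrete, here is one possible reading of "key-coded",
sketched in Python. This is only an illustration under stated assumptions:
the proposal text does not specify any method, and the function name
pseudonymize, the example key, and the device identifiers are all
hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable key code using HMAC-SHA256.

    Hypothetical helper: one way "unique key-coded data" could be
    produced, not a method taken from the proposal.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical key held by the measurement party.
key = b"example-secret-key"

code_a_first = pseudonymize("device-A", key)
code_a_again = pseudonymize("device-A", key)
code_b = pseudonymize("device-B", key)

# The same device always gets the same code, and different devices get
# different codes, so records can be told apart without storing the raw
# identifier alongside them.
print(code_a_first == code_a_again)
print(code_a_first != code_b)
```

Note that under this reading anyone who holds the key can recompute the code
for a known identifier, so whether such data counts as "not identifying"
depends on the answer to question (1).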

(3) Why allow pseudonymization to be delayed until "statistical analysis
begins"?  Why not require pseudonymization to be done promptly when data is
initially collected?


Second, regarding the "independent certification process under the
oversight of a generally-accepted market research industry organization
that maintains a web platform providing user information about audience
measurement research. This web platform lists the parties eligible to
collect information under DNT standards and the audience measurement
research permitted use ..."

(1) The authors appear to have a specific organization in mind.  Which
organization is that, and who runs it?

(2) What is the rationale for giving a particular organization control over
the certification process and the ability to declare who is eligible to
exercise this permitted use?



On Wed, Jul 17, 2013 at 9:46 AM, Peter Swire <peter@peterswire.net> wrote:

>  Good morning:
>
>
> So that we're sure to be on the same page for this, here is the normative
> and non-normative text on audience measurement for today's call.  Edits in
> red in light of recent discussions that Kathy Joe and her group have had
> with a number of WG members.
>
>
> Thanks,
>
>
> Peter
>
>
> ==
>
>
> Issue 25: Aggregated data collection and use for audience measurement
> research 4 July 2013
>
>
> Normative:
>
> Information may be collected, retained and used by a third party for
> audience measurement research where the information is used to calibrate,
> validate or calculate through data collected from opted-in panels, which
> in part contains information collected across sites and over time from
> user agents.
>
> A third party eligible for an audience measurement research permitted use
> MUST adhere to the following restrictions. The data collected by the third
> party:
>
> • Must be pseudonymized before statistical analysis begins, such that
> unique key-coded data are used to distinguish one individual from another
> without identifying them, and
>
> • Must not be shared with any other party unless the data are
> de-identified prior to sharing, and
>
> • Must be deleted or de-identified as early as possible after the purpose
> of collection is met and in no case shall such retention, prior to
> de-identification, exceed 53 weeks and
>
> • Must not be used for any other independent purpose including changing
> an individual’s user experience or building a profile for ad targeting
> purposes.
>
> • In addition, the third party must be subject to an independent
> certification process under the oversight of a generally-accepted market
> research industry organization that maintains a web platform providing
> user information about audience measurement research. This web platform
> lists the parties eligible to collect information under DNT standards and
> the audience measurement research permitted use and it provides users with
> an opportunity to exclude their data contribution.
>
>
> Non-normative: collection and use for audience measurement research
>
> Audience measurement research creates statistical measures of the reach in
> relation to the total online population, and frequency of exposure of the
> content to the online audience, including paid components of web pages.
>
> Audience measurement research for DNT purposes originates with opt-in
> panel output that is calibrated by counting actual hits on tagged content
> on websites. The panel output is re-adjusted using data collected from a
> broader online audience in order to ensure data produced from the panel
> accurately represents the whole online audience.
>
> This online data is collected on a first party and third party basis. This
> collection tracks the content accessed by a device rather than involving
> the collection of a user’s browser history. Audience measurement is
> centered around specific content, not around a user.
>
> The collected data is retained for a given period for purposes of sample
> quality control, and auditing. During this retention period contractual
> measures must be in place to limit access to, and protect the data, as
> well as restrict the data from other uses. This retention period is set by
> auditing bodies, after which the data must be de-identified.
>
> The purposes of audience measurement research must be limited to:
>
> · Facilitating online media valuation, planning and buying via accurate
> and reliable audience measurement.
>
> · Optimizing content and placement on an individual site.
>
> The term “audience measurement research” does not include sales,
> promotional, or marketing activities directed at a specific computer or
> device. Audience measurement data must be reported as aggregated
> information such that no recipient is able to build commercial profiles
> about particular individuals or devices.
>
>
> Prof. Peter P. Swire
> C. William O'Neill Professor of Law
> Ohio State University
> 240.994.4142
> www.peterswire.net
>
> Beginning August 2013:
> Nancy J. and Lawrence P. Huang Professor
> Law and Ethics Program
> Scheller College of Business
> Georgia Institute of Technology
>
>
> From: Kathy Joe <kathy@esomar.org>
> Date: Wednesday, July 17, 2013 9:24 AM
> To: "public-tracking@w3.org" <public-tracking@w3.org>
> Subject: ISSUE-25 re 5.2 Audience measurement: ACTION 415 June change
> proposal:
> Resent-From: <public-tracking@w3.org>
> Resent-Date: Wednesday, July 17, 2013 9:25 AM
>
>  Dear All,
>
> Over the last few weeks as agreed by the group, we have had several calls
> including Susan Israel, Richard Weaver, Adam Philips as well as Peter Swire
> with Rigo, Justin and Jeff - the wiki
> http://www.w3.org/wiki/Privacy/TPWG/Change_Proposal_Audience_Measurement is
> not yet updated to cover these exchanges including
>
>    - The mail from Rigo withdrawing his suggestion following my note to
>    the group with a clarification
>    - The most recent submission following our call with Justin with
>    additional wording on 'pseudonymized'. It also includes clarification on
>    the purpose of AMR (see attached, text in red is new text not in the wiki
>    version)
>    - A note to Jeff Chester clarifying part of the non normative text
>
> I attach the email string below in advance of our call later today
>
> Best regards
>
> Kathy Joe
>
>
>


-- 
Edward W. Felten
Professor of Computer Science and Public Affairs
Director, Center for Information Technology Policy
Princeton University
609-258-5906           http://www.cs.princeton.edu/~felten

Received on Wednesday, 17 July 2013 15:43:06 UTC