Merge pull request #2 from private-attribution/histogram
Histogram explanation
martinthomson authored Sep 12, 2024
2 parents 8adc624 + d6fa2df commit 7dc8bb0
Showing 3 changed files with 233 additions and 21 deletions.
155 changes: 134 additions & 21 deletions api.bs
@@ -25,6 +25,57 @@ that enables the collection of aggregated, differentially-private metrics.
The primary goal of this API is to enable attribution for advertising.


## Attribution ## {#s-attribution}

<dfn lt=attribution|attributed>Attribution</dfn> is the process of identifying [=actions=]
that precede an [=outcome=] of interest,
and allocating value to those [=actions=].

For advertising, <dfn>actions</dfn> that are of interest
are primarily the showing of advertisements
(also referred to as <dfn>impressions</dfn>).
Other actions include ad clicks (or other interactions)
and opportunities to show ads that were not taken.

Desired <dfn>outcomes</dfn> for advertising are more diverse,
as they include any result that an advertiser seeks to improve
through the showing of ads.
A desirable outcome might also be referred to as a <dfn>conversion</dfn>,
which refers to "converting" a potential customer
into a customer.
What counts as a conversion could include
sales, subscriptions, page visits, and enquiries.

For this API, [=actions=] and [=outcomes=] are both
events: things that happen once.
What is unique about attribution for advertising
is that these events might not occur on the same [=site=].
Advertisements are most often shown on sites
other than the advertiser's site.

The primary challenge with attribution is in maintaining privacy.
Attribution involves connecting activity on different sites.
The goal of attribution is to find an impression
that was shown to the same person before the conversion occurred.

If attribution information were directly revealed,
it would enable unwanted
[[PRIVACY-PRINCIPLES#dfn-cross-context-recognition|cross-context recognition]],
thereby enabling [[UNSANCTIONED-TRACKING|tracking]].

This document avoids cross-context recognition by ensuring that
attribution information is aggregated using an [=aggregation service=].
The aggregation service is trusted to compute an aggregate
without revealing the values that each person contributes to that aggregate.
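
As a rough illustration of what "computing an aggregate" means, the following sketch sums per-browser contributions element by element. It is a simplification under stated assumptions: `sumContributions` is a hypothetical name, each contribution is modeled as a plain array of numbers, and the data is made up. The [=aggregation service=] is expected to operate on protected contributions so that individual values are never revealed; that protection is not shown here.

```javascript
// Hypothetical sketch, not part of the API: element-wise sum of contributions.
// In a real deployment the individual arrays would not be visible in the clear.
function sumContributions(contributions, size) {
  const total = new Array(size).fill(0);
  for (const contribution of contributions) {
    for (let i = 0; i < size; i++) {
      total[i] += contribution[i];
    }
  }
  return total;
}

// Made-up contributions from three browsers, three buckets each.
const aggregate = sumContributions([[0, 2, 0], [0, 0, 0], [1, 0, 0]], 3);
console.log(aggregate); // only the aggregate is meant to be revealed
```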

Strict limits are placed on the amount of information that each browser instance
contributes to the aggregates for a given site.
Differential privacy is used to provide additional privacy protection for each contribution.

Details of aggregation service operation are included in [[#aggregation]].
The differential privacy design used is outlined in [[#dp]].


## Background ## {#background}

From the early days of the Web,
@@ -35,7 +86,7 @@ was the ability to obtain information about the effectiveness of advertising cam

Web advertisers were able to measure key metrics like reach (how many people saw an ad),
frequency (how often each person saw an ad),
and conversions (how many people saw the ad then later took the action that the ad was supposed to motivate).
and [=conversions=] (how many people saw the ad then later took the action that the ad was supposed to motivate).
In comparison, these measurements were far more timely and accurate than for any other medium.

The cost of measurement performance was privacy.
@@ -96,7 +147,50 @@ New additions to the

## Attribution Using Histograms ## {#histograms}

TODO explain why we use histograms
[=Attribution=] attempts to measure correlation
between one or more ad placements ([=impressions=])
and the [=outcomes=] that an advertiser desires.

When considered in the aggregate,
information about individuals is not useful.
Actions and outcomes need to be grouped.

The simplest form of attribution splits impressions into a number of groupings
according to the attributes of the advertisement
and counts the number of conversions.
Groupings might be formed from attributes such as
where the ad is shown,
what was shown (the "creative"),
when the ad was shown,
or to whom.

These groupings
and the tallies of conversions attributed to each
form a histogram.
Each bucket of the histogram counts the conversions
for a group of ads.

<figure>
<pre class=include-raw>
path:images/histogram.svg
</pre>
<figcaption>Sample histogram for conversion counts,
grouped by the site where the impressions were shown</figcaption>
</figure>

Different groupings might be used for different purposes.
For instance, grouping by creative (the content of an ad)
might be used to learn which creative works best.

Adding a value greater than one at each conversion
enables more than simple counts.
Histograms can also aggregate values,
which might be used to differentiate between different outcomes.
A higher value might be used for larger purchases
or any outcome that is more highly valued.
A conversion value might also be split between multiple impressions
to split credit,
though this capability is not presently supported in the API.
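
The grouping and tallying described in this section can be sketched in JavaScript. This is a minimal sketch with made-up data, not a description of how any browser or aggregation service works; `buildHistogram`, `impressionSite`, and `value` are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper, not part of the API: tally conversion values into
// buckets keyed by a grouping attribute of the attributed impression.
function buildHistogram(attributedConversions, groupBy) {
  const histogram = new Map();
  for (const conversion of attributedConversions) {
    const bucket = groupBy(conversion);
    // A value of 1 gives simple counts; larger values weight the outcome.
    const value = conversion.value ?? 1;
    histogram.set(bucket, (histogram.get(bucket) ?? 0) + value);
  }
  return histogram;
}

// Illustrative data only; these sites and values are made up.
const conversions = [
  { impressionSite: "news.example", value: 1 },
  { impressionSite: "blog.example", value: 1 },
  { impressionSite: "news.example", value: 3 },
];

// Group by the site where the impression was shown.
const bySite = buildHistogram(conversions, (c) => c.impressionSite);
console.log(Object.fromEntries(bySite));
```

Passing a different `groupBy` function, for example one that returns the creative, produces a histogram for a different purpose from the same data.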

* Compatibility with privacy-preserving aggregation systems
* Flexibility to assign buckets
@@ -109,36 +203,49 @@ TODO explain why we use histograms
The private attribution API provides aggregate information about the
association between two classes of events: [=impressions=] and [=conversions=].

An <dfn>impression</dfn> is the
event to which [=conversion=]s are being attributed. Selection of impression
events is left to the consumer of the API. Examples include:
An [=impression=] is any action that an advertiser takes on any website.
The API does not constrain what can be recorded as an impression.
Typical actions that an advertiser might seek to measure include:

* Displaying an advertisement to a user.
* Viewing a particular web page.
* Displaying an advertisement.
* Having a user interact with an advertisement in some way.
* Not displaying an advertisement (especially for controlled experiments that seek to confirm whether an advertising campaign is effective).

A <dfn>conversion</dfn> is the
event being attributed to [=impression=]s. Selection of conversion events
is again left to the consumer of the API. Examples include:
For the API, a [=conversion=] is an [=outcome=] that is being measured.
The API does not constrain what might be considered to be an outcome.
Typical outcomes that advertisers might seek to measure include:

* Signing up for an account.
* Making a purchase.
* Signing up for an account.
* Visiting a webpage.

When an [=impression=] occurs, information about the impression is saved by the
browser. This includes an identifier for the impression
and some metadata about the impression, such as whether the impression was an
ad view or an ad click.
When an [=impression=] occurs,
the <a method for=PrivateAttribution>saveImpression()</a> method can be used
to request that the browser save information.
This includes an identifier for the impression
and some additional information about the impression.
For instance, advertisers might use additional information
to record whether the impression was an ad view or an ad click.

At [=conversion=] time, a [=conversion report=] is created.
A <dfn>conversion report</dfn> is an encrypted histogram contribution
that includes information from any [=impressions=] that the browser previously stored.

At [=conversion=] time, information for aggregation is created based on the
impressions that were previously stored. A site can request that the browser
select impressions based on a simple query.
The <a method for=PrivateAttribution>measureConversion()</a> method accepts parameters
that tell the browser how to construct a [=conversion report=].
These include a simple query that selects from the [=impressions=]
that the browser has stored,
a value to attribute to the selected impression(s),
and other information needed to construct the [=conversion report=].

* If there was no matching impression,
The histogram created by the [=conversion report=] is constructed as follows:

* If the query found no impressions,
or the [=privacy budget=] for the site is exhausted,
a histogram consisting entirely of zeros (0) is constructed.

* If a matching impression is found,
the specified value is added to a histogram
the provided value is added to a histogram
at the bucket that was specified at the time of the impression.
All other buckets are set to zero.

@@ -190,7 +297,9 @@ dictionary PrivateAttributionAggregationSystem {
};
</xmp>

## SaveImpression API ## {#save-impression-api}
## Saving Impressions ## {#save-impression-api}

The <dfn method for=PrivateAttribution>saveImpression()</dfn> method requests that the browser save information about an [=impression=].

<pre>
navigator.privateAttribution.saveImpression({
@@ -231,9 +340,12 @@ Implicit saveImpression API inputs:

## MeasureConversion API ## {#measure-conversion-api}

The <dfn method for=PrivateAttribution>measureConversion()</dfn> method requests that the browser construct a [=conversion report=].

TODO:
* Change filter data

<pre>
navigator.privateAttribution.measureConversion({
// name of the aggregation system
aggregator: "aggregator.example",
@@ -251,6 +363,7 @@ navigator.privateAttribution.measureConversion({
// a list of sites where impressions might have been registered
source: ["publisher.example"]
});
</pre>

// TODO clarify "Infinity"

31 changes: 31 additions & 0 deletions images/histogram.svg
68 changes: 68 additions & 0 deletions images/value.svg
