From 39bda30e7b99227fc828d4e3e8d9c30805ddb1e4 Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Wed, 2 Oct 2024 15:32:27 -0400 Subject: [PATCH 1/6] update algorithm for budget deduction --- api.bs | 38 +++++++++++++++++++++++++++++++++++--- 1 file changed, 35 insertions(+), 3 deletions(-) diff --git a/api.bs b/api.bs index 7a246be..ffe07ad 100644 --- a/api.bs +++ b/api.bs @@ -6,6 +6,7 @@ URL: https://private-attribution.github.io/api/ Editor: Martin Thomson, w3cid 68503, Mozilla https://mozilla.org/, mt@mozilla.com Editor: Andy Leiserson, w3cid 147715, Mozilla https://mozilla.org/, aleiserson@mozilla.com Editor: Benjamin Savage, w3cid 114877, Meta https://www.meta.com/, btsavage@meta.com +Editor: Benjamin Case, w3cid 128082, Meta https://www.meta.com/, bmcase@meta.com Abstract: This specifies a browser API for the measurement of advertising performance. The goal is to produce aggregate statistics about how advertising leads to conversions, without creating a risk to the privacy of individual web users. This API collates information about people from multiple web origins, which could be a significant risk to their privacy. To manage this risk, the information that is gathered is aggregated using an aggregation service that is chosen by websites and trusted to perform aggregation within strict limits. Noise is added to the aggregates produced by this service to provide differential privacy. Status Text: This specification is a proposal that is intended to be migrated to the W3C standards track. It is not a standard. Text Macro: LICENSE W3C Software and Document License @@ -612,7 +613,7 @@ The arguments to measureConversion() are as
The maximum [=conversion value=] across all contributions included in the aggregation. Together with epsilon, this is used to calibrate the distribution of random noise that - will be added to the outcome. It is also used to determine the amount of [=privacy budget=] + will be added to the outcome. It is also used to determine the amount of [=privacy budget=] to expend on this [=conversion report=].
lookbackDays
@@ -623,8 +624,8 @@ The arguments to measureConversion() are as
A list of impression sites. Only [=impressions=] recorded where the top-level site is on this list are eligible to match this [=conversion=].
intermediarySites
- A list of sites which called the saveImpression() API. - Only [=impressions=] recorded by scripts originating from one of the intermediary sites + A list of sites which called the saveImpression() API. + Only [=impressions=] recorded by scripts originating from one of the intermediary sites are eligible to match this [=conversion=].
@@ -1097,6 +1098,37 @@ conversion report. ### Privacy Budget Deduction ### {#dp-deduction} +When a conversion requests attribution, the call includes several querier-provided +parameters: +1. the window of epochs to search for relevant events (`epochs` parameter); +2. the requested privacy budget (`requested_epsilon`); +3. the `filterData` value used for selecting relevant events; +4. the `PrivateAttributionLogic` such as last-touch or equal-credit; +5. two sensitivity parameters: `report_global_sensitivity` which is a cap on how much attributed +value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` +which is a maximum sensitivity across all reports to be processed by the aggregation query. +6. the p-norm to use when bounding the histogram contribution's sensitivity. 1-norm corresponding +to using Laplace noise in aggregation query and 2-norm for Gaussian noise. + +The algorithm for deducting budget and computing the attributed report will first look across +epochs for eligible impressions. It will deduct budget from any epoch with eligible +impressions. After budget has been deducted, impressions from epochs with sufficient budget will be considered for attribution. + +The following steps happen for every epoch in the window of epochs. +Step 1: select relevant impressions within an epoch using the `filterData`. +Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of (site). There are three cases +* Case 1: If the epoch has no relevant impressions the privacy loss is 0. +* Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the p-norm of the attribution function +applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is +then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. 
+* Case 3: If multiple epochs are considered, the privacy loss deducted from the epoch's budget is +`requested_epsilon * report_global_sensitivity / query_global_sensitivity` + +Step 3: Attempt to deduct the privacy loss from the epoch's budget. If the filter has sufficient budget, the impressions +are added to the set to be considered for attribution; otherwise, they are dropped. +Step 4: The attribution function is applied across the eligible impressions from all epochs (which had budget). +The browser ensures that p-norm of the attribution histogram is `<= report_global_sensitivity`. + When searching for impressions for the conversion report, the user agent deducts the specified ε value from the budget for the week in which those impressions were saved. From 2b96cfa8a2be27a8e835dfa5c99dc92d2fcbea80 Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Wed, 2 Oct 2024 15:34:47 -0400 Subject: [PATCH 2/6] checkpoint --- api.bs | 6 ------ 1 file changed, 6 deletions(-) diff --git a/api.bs b/api.bs index ffe07ad..da5f5a9 100644 --- a/api.bs +++ b/api.bs @@ -1129,13 +1129,7 @@ are added to the set to be considered for attribution; otherwise, they are droppe Step 4: The attribution function is applied across the eligible impressions from all epochs (which had budget). The browser ensures that p-norm of the attribution histogram is `<= report_global_sensitivity`. -When searching for impressions for the conversion report, -the user agent deducts the specified ε value from -the budget for the week in which those impressions were saved. -If the privacy budget for that week is not sufficient, -the impressions from that week are not used. -The details of how to deduct privacy budget is given below ... WIP
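
The four steps introduced by this patch can be sketched in a few lines of Python. This is only an illustrative reading of the prose, not the specification's algorithm: the names `Impression`, `last_touch`, `deduct_and_attribute`, and the `budgets` map are hypothetical, the L1-norm is used throughout, and two behaviours are assumptions because the text does not pin them down (an epoch with insufficient budget keeps its remaining budget untouched, and a histogram that exceeds `report_global_sensitivity` raises an error as a stand-in for "the browser ensures").

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Impression:
    epoch: int        # epoch in which the impression was saved
    timestamp: int    # only used by the last-touch example below
    filter_data: int  # compared against the conversion's filterData
    bucket: int       # histogram index that receives attributed value


Histogram = List[float]
AttributionLogic = Callable[[List[Impression]], Histogram]


def l1_norm(histogram: Histogram) -> float:
    return sum(abs(v) for v in histogram)


def last_touch(value: float, size: int) -> AttributionLogic:
    """Hypothetical last-touch logic: all value goes to the latest impression."""
    def attribute(impressions: List[Impression]) -> Histogram:
        histogram = [0.0] * size
        if impressions:
            latest = max(impressions, key=lambda i: i.timestamp)
            histogram[latest.bucket] = value
        return histogram
    return attribute


def deduct_and_attribute(
    impressions: List[Impression],
    budgets: Dict[int, float],  # remaining budget per epoch, mutated in place
    epochs: List[int],
    filter_data: int,
    requested_epsilon: float,
    report_global_sensitivity: float,
    query_global_sensitivity: float,
    attribute: AttributionLogic,
) -> Histogram:
    eligible: List[Impression] = []
    for epoch in epochs:
        # Step 1: select relevant impressions within this epoch.
        relevant = [i for i in impressions
                    if i.epoch == epoch and i.filter_data == filter_data]
        # Step 2: compute the privacy loss for this epoch.
        if not relevant:
            continue  # Case 1: the loss is 0, nothing to deduct
        if len(epochs) == 1:
            # Case 2: individual sensitivity of this epoch's impressions.
            sensitivity = l1_norm(attribute(relevant))
        else:
            # Case 3: fall back to the per-report cap.
            sensitivity = report_global_sensitivity
        loss = requested_epsilon * sensitivity / query_global_sensitivity
        # Step 3: deduct if possible; otherwise drop this epoch's impressions.
        # Assumption: a failed deduction leaves the budget unchanged.
        available = budgets.get(epoch, 0.0)
        if available >= loss:
            budgets[epoch] = available - loss
            eligible.extend(relevant)
    # Step 4: attribute across every epoch that had budget.
    histogram = attribute(eligible)
    if l1_norm(histogram) > report_global_sensitivity:
        raise ValueError("attribution exceeds report_global_sensitivity")
    return histogram


# Example: two epochs, the second one lacks the budget for a 0.5 deduction.
saved = [
    Impression(epoch=1, timestamp=10, filter_data=7, bucket=0),
    Impression(epoch=2, timestamp=20, filter_data=7, bucket=2),
]
remaining = {1: 1.0, 2: 0.25}
report = deduct_and_attribute(saved, remaining, [1, 2], 7,
                              requested_epsilon=1.0,
                              report_global_sensitivity=4.0,
                              query_global_sensitivity=8.0,
                              attribute=last_touch(4.0, 3))
```

In this example each epoch is charged `1.0 * 4.0 / 8.0 = 0.5`. Epoch 1 can pay and contributes its impression, while epoch 2 cannot, so the later impression is dropped and last-touch credit falls back to the impression from epoch 1.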
In the following figure, From 2a4fd230c25cf8367ee31c33d5d965a69e509069 Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Wed, 2 Oct 2024 15:41:15 -0400 Subject: [PATCH 3/6] citation and markdown fixes --- api.bs | 18 ++++++++++-------- 1 file changed, 10 insertions(+), 8 deletions(-) diff --git a/api.bs b/api.bs index da5f5a9..8e2f45a 100644 --- a/api.bs +++ b/api.bs @@ -1105,27 +1105,29 @@ parameters: 3. the `filterData` value used for selecting relevant events; 4. the `PrivateAttributionLogic` such as last-touch or equal-credit; 5. two sensitivity parameters: `report_global_sensitivity` which is a cap on how much attributed -value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` -which is a maximum sensitivity across all reports to be processed by the aggregation query. + value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` + which is a maximum sensitivity across all reports to be processed by the aggregation query. 6. the p-norm to use when bounding the histogram contribution's sensitivity. 1-norm corresponding -to using Laplace noise in aggregation query and 2-norm for Gaussian noise. + to using Laplace noise in aggregation query and 2-norm for Gaussian noise. -The algorithm for deducting budget and computing the attributed report will first look across +The algorithm to deduct privacy budget and compute the attributed histogram will first look across epochs for eligible impressions. It will deduct budget from any epoch with eligible impressions. After budget has been deducted, impressions from epochs with sufficient budget will be considered for attribution. The following steps happen for every epoch in the window of epochs. Step 1: select relevant impressions within an epoch using the `filterData`. -Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of (site). 
There are three cases +Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of [[PPA-DP]]. There are three cases * Case 1: If the epoch has no relevant impressions the privacy loss is 0. * Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the p-norm of the attribution function -applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is -then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. + applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is + then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. * Case 3: If multiple epochs are considered, the privacy loss deducted from the epoch's budget is -`requested_epsilon * report_global_sensitivity / query_global_sensitivity` + `requested_epsilon * report_global_sensitivity / query_global_sensitivity` Step 3: Attempt to deduct the privacy loss from the epoch's budget. If the filter has sufficient budget, the impressions are added to the set to be considered for attribution; otherwise, they are dropped. + +After every epoch has been considered separately, the final step is run across all epochs. Step 4: The attribution function is applied across the eligible impressions from all epochs (which had budget). The browser ensures that p-norm of the attribution histogram is `<= report_global_sensitivity`. From 978b16c7de444b7b525ea778919ae6b166cbf9fb Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Wed, 2 Oct 2024 15:43:40 -0400 Subject: [PATCH 4/6] line indents --- api.bs | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/api.bs b/api.bs index 8e2f45a..447935c 100644 --- a/api.bs +++ b/api.bs @@ -1105,10 +1105,10 @@ parameters: 3. the `filterData` value used for selecting relevant events; 4. the `PrivateAttributionLogic` such as last-touch or equal-credit; 5. 
two sensitivity parameters: `report_global_sensitivity` which is a cap on how much attributed - value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` - which is a maximum sensitivity across all reports to be processed by the aggregation query. + value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` + which is a maximum sensitivity across all reports to be processed by the aggregation query. 6. the p-norm to use when bounding the histogram contribution's sensitivity. 1-norm corresponding - to using Laplace noise in aggregation query and 2-norm for Gaussian noise. + to using Laplace noise in aggregation query and 2-norm for Gaussian noise. The algorithm to deduct privacy budget and compute the attributed histogram will first look across epochs for eligible impressions. It will deduct budget from any epoch with eligible @@ -1119,10 +1119,10 @@ Step 1: select relevant impressions within an epoch using the `filterData`. Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of [[PPA-DP]]. There are three cases * Case 1: If the epoch has no relevant impressions the privacy loss is 0. * Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the p-norm of the attribution function - applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is - then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. + applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is + then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. 
* Case 3: If multiple epochs are considered, the privacy loss deducted from the epoch's budget is - `requested_epsilon * report_global_sensitivity / query_global_sensitivity` + `requested_epsilon * report_global_sensitivity / query_global_sensitivity` Step 3: Attempt to deduct the privacy loss from the epoch's budget. If the filter has sufficient budget, the impressions are added to the set to be considered for attribution; otherwise, they are dropped. From dc7d57d941ac21c2cc52d96db875ba8fe18d4474 Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Thu, 3 Oct 2024 21:47:42 -0400 Subject: [PATCH 5/6] only specify L1-norm, not p-norm --- api.bs | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/api.bs b/api.bs index 447935c..26eb35d 100644 --- a/api.bs +++ b/api.bs @@ -1107,8 +1107,6 @@ parameters: 5. two sensitivity parameters: `report_global_sensitivity` which is a cap on how much attributed value can come from this one conversion (e.g. the conversion value) and `query_global_sensitivity` which is a maximum sensitivity across all reports to be processed by the aggregation query. -6. the p-norm to use when bounding the histogram contribution's sensitivity. 1-norm corresponding - to using Laplace noise in aggregation query and 2-norm for Gaussian noise. The algorithm to deduct privacy budget and compute the attributed histogram will first look across epochs for eligible impressions. It will deduct budget from any epoch with eligible @@ -1118,7 +1116,7 @@ The following steps happen for every epoch in the window of epochs. Step 1: select relevant impressions within an epoch using the `filterData`. Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of [[PPA-DP]]. There are three cases * Case 1: If the epoch has no relevant impressions the privacy loss is 0. 
-* Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the p-norm of the attribution function +* Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the L1-norm of the attribution function applied to only the impressions in this epoch. The privacy loss deducted from the epoch's budget is then `requested_epsilon * individual_sensitivity / query_global_sensitivity`. * Case 3: If multiple epochs are considered, the privacy loss deducted from the epoch's budget is @@ -1129,7 +1127,7 @@ are added to the set to be considered for attribution; otherwise, they are droppe After every epoch has been considered separately, the final step is run across all epochs. Step 4: The attribution function is applied across the eligible impressions from all epochs (which had budget). -The browser ensures that p-norm of the attribution histogram is `<= report_global_sensitivity`. +The browser ensures that the L1-norm of the attribution histogram is `<= report_global_sensitivity`. From 212e7a9dcaab15545e5c116e174ca3f4aecd4bee Mon Sep 17 00:00:00 2001 From: Benjamin Case Date: Thu, 3 Oct 2024 22:06:16 -0400 Subject: [PATCH 6/6] fmt --- api.bs | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/api.bs b/api.bs index 26eb35d..65334ef 100644 --- a/api.bs +++ b/api.bs @@ -7,7 +7,7 @@ Editor: Martin Thomson, w3cid 68503, Mozilla https://mozilla.org/, mt@mozilla.co Editor: Andy Leiserson, w3cid 147715, Mozilla https://mozilla.org/, aleiserson@mozilla.com Editor: Benjamin Savage, w3cid 114877, Meta https://www.meta.com/, btsavage@meta.com Editor: Benjamin Case, w3cid 128082, Meta https://www.meta.com/, bmcase@meta.com -Abstract: This specifies a browser API for the measurement of advertising performance. The goal is to produce aggregate statistics about how advertising leads to conversions, without creating a risk to the privacy of individual web users. 
This API collates information about people from multiple web origins, which could be a significant risk to their privacy. To manage this risk, the information that is gathered is aggregated using an aggregation service that is chosen by websites and trusted to perform aggregation within strict limits. Noise is added to the aggregates produced by this service to provide differential privacy. +Abstract: This specifies a browser API for the measurement of advertising performance. The goal is to produce aggregate statistics about how advertising leads to conversions, without creating a risk to the privacy of individual web users. This API collates information about people from multiple web origins, which could be a significant risk to their privacy. To manage this risk, the information that is gathered is aggregated using an aggregation service that is trusted by the user-agent to perform aggregation within strict limits. Noise is added to the aggregates produced by this service to provide differential privacy. Websites may select an aggregation service from the list of approved aggregation services provided by the user-agent. Status Text: This specification is a proposal that is intended to be migrated to the W3C standards track. It is not a standard. Text Macro: LICENSE W3C Software and Document License Complain About: accidental-2119 yes, missing-example-ids yes @@ -1113,7 +1113,9 @@ epochs for eligible impressions. It will deduct budget from any epoch with eligi impressions. After budget has been deducted, impressions from epochs with sufficient budget will be considered for attribution. The following steps happen for every epoch in the window of epochs. + Step 1: select relevant impressions within an epoch using the `filterData`. + Step 2: For each epoch compute the individual privacy loss of the query following Thm 4 of [[PPA-DP]]. There are three cases * Case 1: If the epoch has no relevant impressions the privacy loss is 0. 
* Case 2: If the window of epochs contains only a single epoch, the `individual_sensitivity` is the L1-norm of the attribution function