Partial conversion values #16

Closed
martinthomson opened this issue Sep 23, 2024 · 9 comments
Labels
discuss Needs working group discussion

Comments

@martinthomson
Member

martinthomson commented Sep 23, 2024

I had been assuming we could report a partial conversion value if there is insufficient but non-zero privacy budget. Since the privacy budget is likely to be chosen by the browser (and may vary between browsers and over time), the conversion site doesn't necessarily know the available budget.

Originally posted by @andyleiserson in #11 (comment)

@csharrison
Collaborator

@martinthomson you mentioned in the PR you "have a preference for avoiding that sort of thing". Can you expand on it?

It's a bit hard to reason about, but overall I don't see a huge problem with this. My intuition is that it is probably a minor utility win at no cost to privacy.

@martinthomson
Member Author

martinthomson commented Jan 29, 2025

This isn't a privacy question as much as it is a question about what sort of functionality we present to sites.

If someone wants to submit a value of 10 with epsilon 5 and they only have epsilon 1 of their budget remaining, there would seem to be two options:

  1. Spend the partial budget and cap the contribution (to 2 in this case)
  2. Don't spend the budget and send a zero
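
The arithmetic behind the two options can be sketched as follows. This is only an illustration, not part of any specification: the function names `partial_spend` and `all_or_nothing` are hypothetical, and it assumes the contribution is scaled linearly by the fraction of the requested epsilon that is still available.

```python
def partial_spend(value, requested_epsilon, remaining_epsilon):
    """Option 1: spend whatever budget is left and scale the value to match.

    Returns (reported_value, epsilon_spent). Linear scaling is an assumption.
    """
    spent = min(requested_epsilon, remaining_epsilon)
    return value * spent / requested_epsilon, spent


def all_or_nothing(value, requested_epsilon, remaining_epsilon):
    """Option 2: report zero and spend nothing when the budget is insufficient."""
    if remaining_epsilon >= requested_epsilon:
        return value, requested_epsilon
    return 0, 0


# Value 10 requested at epsilon 5, with only epsilon 1 remaining.
print(partial_spend(10, 5, 1))   # (2.0, 1)
print(all_or_nothing(10, 5, 1))  # (0, 0)
```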

The feedback I've gotten from people is that they would like budget exhaustion to have as predictable an effect on the result as possible.

On reflection, I don't know what the best option is. I don't know which option is most predictable, though I will observe that if we reserve budget for reporting on truncation, the amount we need might increase if we want to indicate that partial spend has occurred, as opposed to a simple boolean (i.e., a count) of the number of (wholly) lost conversions.

I do have a preference for fixing this in the specification, rather than making it a choice, but even that is open to debate. What is your preference?

@martinthomson added the discuss (Needs working group discussion) label Jan 29, 2025
@martinthomson moved this to Essential in PPA API, Level 1 Feb 3, 2025

@csharrison
Collaborator

Thinking about this more, I actually think spending a partial budget should not be allowed as it allows violations of the privacy guarantee. Here is an example, where each item is a new epoch:

  1. Impression 1
  2. Impression 2
  3. Impression 3
  4. Conversion spending .5 matching imp 2 and 3 → (0, 0, .5)
  5. Conversion spending 1 matching imp 1 and 3 → (0, 0, .5) (partial budget spend happens here)

Neighbor removes imp 3. Conversions return (0, .5, 0) and (1, 0, 0), so the L1 diff = 1 + 1.5 = 2.5.
Without partial credit the first database returns (0, 0, .5), (0, 0, 0) and the neighbor returns (0, .5, 0) and (1, 0, 0) and the issue is resolved (L1 diff 2).
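
The L1 figures in this comment can be checked mechanically. The sketch below assumes each conversion yields one report vector with an entry per epoch, and that the distance between two databases is the sum of absolute differences over all reports. `l1_distance` is a hypothetical helper, not something defined in the specification.

```python
def l1_distance(reports_a, reports_b):
    """Sum of absolute differences across paired per-epoch report vectors."""
    return sum(
        abs(x - y)
        for a, b in zip(reports_a, reports_b)
        for x, y in zip(a, b)
    )


# Reports for the two conversions, as written in the comment above.
with_imp3_partial = [(0, 0, 0.5), (0, 0, 0.5)]     # partial spend allowed
with_imp3_no_partial = [(0, 0, 0.5), (0, 0, 0)]    # second conversion zeroed
without_imp3 = [(0, 0.5, 0), (1, 0, 0)]            # neighbor: impression 3 removed

print(l1_distance(with_imp3_partial, without_imp3))     # 2.5
print(l1_distance(with_imp3_no_partial, without_imp3))  # 2.0
```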

Silver lining I guess is if this is true, then it's one less thing we need to discuss 😆 ?

@martinthomson
Member Author

Hmm, I think that we're resolving that a change to one epoch has -- at most -- an L1 difference of 2, so that would seem to blow the limit in an undesirable way.

An alternative interpretation would be to have queries capped by the budget that is available across all epochs that contain impressions. Then the two outcomes would be (0, 0, .5) and (0, 0, .5) for the case with impression 3 and (0, .5, 0) and (.5, 0, 0) for the one without. That is probably not an outcome that the paper contemplates though. (It is consistent with the description in the paper, but it requires a re-assessment.)

@csharrison
Collaborator

> An alternative interpretation would be to have queries capped by the budget that is available across all epochs that contain impressions. Then the two outcomes would be (0, 0, .5) and (0, 0, .5) for the case with impression 3 and (0, .5, 0) and (.5, 0, 0) for the one without. That is probably not an outcome that the paper contemplates though. (It is consistent with the description in the paper, but it requires a re-assessment.)

This doesn't seem like a great idea. I think it's a reasonable assumption that older epochs will have less budget, and limiting the contribution to recent epochs seems quite limiting. I'm not sure this "partial spend" functionality is worth it at that point.

@csharrison
Collaborator

> Thinking about this more, I actually think spending a partial budget should not be allowed as it allows violations of the privacy guarantee. Here is an example, where each item is a new epoch:
>
>   1. Impression 1
>   2. Impression 2
>   3. Impression 3
>   4. Conversion spending .5 matching imp 2 and 3 → (0, 0, .5)
>   5. Conversion spending 1 matching imp 1 and 3 → (0, 0, .5) (partial budget spend happens here)
>
> Neighbor removes imp 3. Conversions return (0, .5, 0) and (1, 0, 0), so the L1 diff = 1 + 1.5 = 2.5. Without partial credit the first database returns (0, 0, .5), (0, 0, 0) and the neighbor returns (0, .5, 0) and (1, 0, 0) and the issue is resolved (L1 diff 2).

Hm, actually I think I made an error. It should be:

Without partial credit the first database returns (0, 0, .5), (1, 0, 0) and the neighbor returns (0, .5, 0) and (1, 0, 0) and the issue is resolved (L1 diff 1).

This is because we treat budget exhaustion the same as if there were no matching impressions.
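
A small simulation of the corrected example, under stated assumptions: every epoch starts with a budget of 1 (implied by the numbers above but not stated), budget is deducted in every epoch that has a matching impression and can afford the spend, and an epoch that cannot afford it is skipped as if it held no matching impression. The names `run` and `l1_distance` are hypothetical and this is not the specified algorithm.

```python
def run(conversions, impressions, budget_per_epoch=1.0, num_epochs=3):
    """Return one last-touch report vector per conversion.

    conversions: list of (spend, set of epochs the conversion matches)
    impressions: set of epochs that actually contain an impression
    """
    budget = {epoch: budget_per_epoch for epoch in range(1, num_epochs + 1)}
    reports = []
    for spend, matched in conversions:
        usable = []
        for epoch in sorted(matched & impressions):
            if budget[epoch] >= spend:
                budget[epoch] -= spend
                usable.append(epoch)
            # Otherwise the epoch is exhausted: treat it as having no match.
        report = [0.0] * num_epochs
        if usable:
            report[usable[-1] - 1] = spend  # last touch: latest usable epoch
        reports.append(tuple(report))
    return reports


def l1_distance(reports_a, reports_b):
    """Sum of absolute differences across paired report vectors."""
    return sum(
        abs(x - y)
        for a, b in zip(reports_a, reports_b)
        for x, y in zip(a, b)
    )


conversions = [(0.5, {2, 3}), (1.0, {1, 3})]
original = run(conversions, impressions={1, 2, 3})
neighbor = run(conversions, impressions={1, 2})

print(original)  # [(0.0, 0.0, 0.5), (1.0, 0.0, 0.0)]
print(neighbor)  # [(0.0, 0.5, 0.0), (1.0, 0.0, 0.0)]
print(l1_distance(original, neighbor))  # 1.0
```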

@csharrison
Collaborator

We agreed to close this issue as wontfix in the 2/11 call.

@csharrison closed this as not planned Feb 11, 2025

@bmcase
Contributor

bmcase commented Feb 12, 2025

I chatted about this further with Roxana and Pierre. The answer right now on the theory side is that, with the current theory, which relies on filters, adjusting sensitivity based on available budget is not justified analytically. Adapting the accounting theory to odometers could potentially permit this, but it requires analytical work.

I thought it would be good to at least record that on this issue in case we pick it up again in relation to multi-touch attribution, where I think there was some interest in the meeting in seeing this reconsidered.
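
For readers unfamiliar with the terms, the distinction can be pictured roughly as below. This is a loose illustration of the two accounting styles, not the construction used in the paper: both class names are hypothetical, and it assumes basic composition, where epsilons simply add up.

```python
class PrivacyFilter:
    """Budget fixed up front; a query either fits entirely or is rejected."""

    def __init__(self, budget):
        self.budget = budget
        self.spent = 0.0

    def try_spend(self, epsilon):
        if self.spent + epsilon > self.budget:
            return False  # rejected: nothing is spent and nothing is released
        self.spent += epsilon
        return True


class PrivacyOdometer:
    """No preset cut-off; only tracks a running bound on the loss so far."""

    def __init__(self):
        self.spent = 0.0

    def spend(self, epsilon):
        self.spent += epsilon
        return self.spent


epoch_filter = PrivacyFilter(budget=1.0)
print(epoch_filter.try_spend(0.5))  # True
print(epoch_filter.try_spend(1.0))  # False
print(epoch_filter.spent)           # 0.5
```

In the filter model the second query is refused outright, which matches the all-or-nothing behavior agreed above. Scaling the query down to fit the remaining 0.5 is the step that, per this comment, the current analysis does not cover.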

@martinthomson
Member Author

So for multi-touch, it seems like we'd be forced to spend all the budget in all epochs with impressions, even for a) epochs that add nothing to the final contribution and b) epochs that only contribute partially to the result. That would seem less than ideal, but I understand that we don't have the analytical framework that would allow us to do better. Hopefully we can do better, but that applies even for last-touch, where we spend budget for epochs 0 through 4 if they have impressions, even if the impression we use comes from epoch 5.
