
Utility vs Privacy for attribution proposals #3

Open
bmilekic opened this issue Feb 10, 2022 · 8 comments

@bmilekic
During the meeting yesterday, the tradeoff between certain privacy considerations and utility was briefly discussed during @csharrison's presentation. We ran out of time before we could get into the nitty-gritty, but I think the question of this tradeoff as it pertains to the various attribution proposals (IPA, PCM, ARA) is critical.

On one hand, adequate privacy protections (as deemed appropriate by browser vendors and the consensus built here) are important, and I imagine they will be addressed in part by the privacy principles we will document; on the other, limited utility could further encourage companies in the advertising ecosystem to [continue to] request and collect PII from end users (think email addresses via login walls) so that they can get better utility "out of band" by matching those identifiers, and the events that reference them, with their partners. While it is not possible to prevent the latter altogether, I would argue that we could greatly discourage it by developing proposals and mechanisms that offer comparable utility.

So, since we are going to be documenting privacy principles, do we think it would be possible to articulate what constitutes "acceptable utility"? As an example, which of the following are must-haves vs. nice-to-haves:

For the attribution proposal being considered:

  • Does it enable cross-device and cross-browser attribution measurement?
  • Can it be extended to support different types of attribution, e.g., fractional credit to the individual impression events leading up to a conversion vs. last-event credit (see the sketch after this list)?
  • Are its reporting time delays acceptable?
  • Fairness: can third-party vendors acting on behalf of a publisher, an advertiser, or both submit and receive reports?
  • What else?
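To make the fractional vs. last-event distinction concrete, here is a minimal sketch (Python, purely illustrative; the event names and the even "linear" split are my own assumptions, not taken from any of the proposals) of how the two credit models differ over the events leading up to a single conversion:

```python
# Illustrative only: contrasts last-event credit with an evenly split
# ("linear") fractional credit over the touchpoints preceding one conversion.
# The event names and the linear split are assumptions, not from any proposal.

def last_touch_credit(touchpoints):
    """All credit goes to the final event before the conversion."""
    return {tp: 1.0 if i == len(touchpoints) - 1 else 0.0
            for i, tp in enumerate(touchpoints)}

def linear_credit(touchpoints):
    """Credit is split evenly across every event leading up to the conversion."""
    share = 1.0 / len(touchpoints)
    return {tp: share for tp in touchpoints}

path = ["impression_pub_A", "impression_pub_B", "click_pub_B"]
print(last_touch_credit(path))  # all credit on 'click_pub_B'
print(linear_credit(path))      # ~0.33 credit per touchpoint
```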

Apologies if this is already addressed elsewhere or in another WG/CG, please point me there if that's the case.

@ssanjay-saran

Great point @bmilekic! We should come up with a framework that covers all the dimensions (privacy and utility may not be the only two here). We probably need a requirements-gathering exercise covering the following constituents:

  1. Platform requirements (browsers and mobile OSes likely have user privacy, user consent, and performance [battery drain, memory utilization] requirements)
  2. Publisher requirements (i.e., media owners likely have privacy, data security, and fairness & transparency requirements)
  3. Advertiser/marketer requirements (e.g., ease of use/integration, frequency and latency of reports, methods and metrics)
  4. Other (regulatory, legal, etc.)

I'm pretty sure I'm missing some others. Please feel free to add.

@ghost

ghost commented Feb 10, 2022

@ssanjay-saran This was actually going to be my point last night. How do these solutions fit into the overall ecosystem for Adtech/Martech? I realize that may need to come later.

@bmayd

bmayd commented Feb 11, 2022

It seems like we ought to include Users/Consumers as a constituency, though requirements gathering for them is less straightforward.

To 1 (Platform) I would add: error/exception reporting; with information flows restricted, the ability to know about and diagnose problems will depend more heavily on support from platforms.

To 3 (Advertiser) I would add: fairness & transparency as well; there should be a means of validating that reports are accurate and complete which is independent of the reporting entity.

I assume requirements for each constituency would be the same for agents acting on behalf of the constituency, but maybe not always.

It might also be useful to consider global requirements for things like auditing and attestation.

@bmilekic

Would the Advertising Use Cases document be a good starting point for sourcing use cases against which to evaluate the considered proposals (or eventual extensions of them)? It already attempts to exhaustively map use cases to the Chrome and Safari (and some of the older community) proposals. We could probably focus on the attribution-related use cases, e.g.:

  • Conversion Lift Measurement
  • Brand Lift Measurement
  • Click-through attribution
  • View-through attribution
  • Multi-touch attribution
  • Cross Browser / Cross Device Measurement
  • Multi-channel attribution
  • We may want to separately consider ML-modelling-specific use cases, in particular the Conversion Rate (CVR) model, P(conversion|click); post-view attribution, P(conversion|impression); and ROAS optimization (see the sketch after this list).
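For the ML use cases in the last bullet, the quantities involved are, at bottom, conditional rates estimated from attributed events. Here is a minimal sketch (Python; the event-log shape is a made-up assumption and not how any of the proposals expose data) of naive estimates of P(conversion|click) and P(conversion|impression):

```python
# Illustrative only: naive estimates of P(conversion | click) and
# P(conversion | impression) from a hypothetical event log. The log shape
# is an assumption; none of the proposals expose data in this form.

events = [
    {"type": "impression", "converted": False},
    {"type": "impression", "converted": True},
    {"type": "click", "converted": True},
    {"type": "click", "converted": False},
    {"type": "impression", "converted": False},
]

def conversion_rate(events, event_type):
    """Share of events of the given type that are attributed a conversion."""
    matching = [e for e in events if e["type"] == event_type]
    if not matching:
        return 0.0
    return sum(e["converted"] for e in matching) / len(matching)

cvr = conversion_rate(events, "click")             # estimate of P(conversion | click)
post_view = conversion_rate(events, "impression")  # estimate of P(conversion | impression)
print(cvr, post_view)  # 0.5 and ~0.33 for this toy log
```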

We can of course add additional considerations to the list. For example (borrowing from @bmayd and @ssanjay-saran):

  • Anticipated constraints on report latency
  • Anticipated constraints on report frequency
  • Anticipated energy consumption for user agents (e.g., mobile battery drain)
  • Operators' ability to debug and/or tune (probably applies more to LARk and ML use cases)
  • Fairness: for conversion measurement, are both the advertiser and publisher (or their representatives/vendors) able to independently verify/calculate reported results?

Finally, I propose that we consider privacy and security separately from utility. I suspect that some proposals will fare better than others from a privacy and security standpoint, while on the utility front a different picture may emerge.

@bmayd

bmayd commented Feb 13, 2022

Finally, I propose that we consider privacy and security separately from utility.

I'm not clear what you are suggesting here; can you expand on it a little?

@bmilekic

Finally, I propose that we consider privacy and security separately from utility.

I'm not clear what you are suggesting here; can you expand on it a little?

Sure, and apologies for the lack of clarity. What I meant is that I think it's easier to analyze the proposals in terms of the utility that each provides, without thinking too much about their privacy and security characteristics. This does not mean that the privacy and security principles and characteristics are unimportant or that we should not consider them, just that they can be looked at and discussed in parallel.

I initially created this issue because when I looked at @csharrison's attribution reporting design space tradeoffs and considerations slides, the third slide, "Privacy vs Utility", lists "timely, fine-grained, accurate" as examples of utility, and I thought it would be good to expand on those three points further, to get a more detailed view of how the proposals compare w.r.t. the utility provided.
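As a very rough illustration of why "fine-grained" and "accurate" pull against each other (this is not how ARA, IPA, or PCM specify their noise; the fixed per-bucket noise scale and the uniform noise stand-in are assumptions purely for illustration): if each reported bucket carries roughly the same amount of noise, then slicing the same conversions into finer-grained buckets makes that noise larger relative to each bucket's true value.

```python
# Illustration only, not taken from ARA, IPA, or PCM: with a fixed amount of
# per-bucket noise, reporting the same 1,000 conversions at finer granularity
# makes the noise larger relative to each bucket's true count.

import random

TOTAL_CONVERSIONS = 1000
NOISE_SCALE = 20  # assumed fixed per-bucket noise magnitude

def mean_relative_error(num_buckets, trials=1000):
    true_per_bucket = TOTAL_CONVERSIONS / num_buckets
    errors = []
    for _ in range(trials):
        noise = random.uniform(-NOISE_SCALE, NOISE_SCALE)  # stand-in for Laplace noise
        errors.append(abs(noise) / true_per_bucket)
    return sum(errors) / len(errors)

for buckets in (1, 10, 100):
    print(f"{buckets:>3} buckets -> mean relative error per bucket ~ "
          f"{mean_relative_error(buckets):.2f}")
```

With 1 bucket the noise is roughly 1% of the true count; with 100 buckets it is on the order of the count itself.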

As one example, if we consider "Cross Browser / Cross Device Measurement" to be useful and highly desirable for advertisers, which of the current proposals provide a way to achieve it? From reading the Mozilla/Meta IPA proposal, it seems that cross-device attribution measurement will be possible within the described framework, assuming match key providers participate at scale. How about the other proposals? Note that I'm not trying to claim that "Cross Browser / Cross Device Measurement" is in and of itself imperative to have on day 1, but I think that looking at the proposals side by side and comparing them in terms of supported capabilities (i.e., utility) will help us understand them better along that dimension.

Hope that helps!

@bmayd

bmayd commented Feb 13, 2022

it's easier to analyze the proposals in terms of the utility that each provides, without thinking too much about their privacy and security characteristics.

Thanks, very helpful and I agree: let's be confident approaches provide sufficient utility that putting effort into understanding their security and privacy implications is worth doing. I think utility here is principally the satisfaction of use-cases, but also doing so at cost and complexity levels that are comfortable and sustainable for intended adopters.

So I'm thinking something like the following order of evaluation:

  • What use-case(s) does the model address?
  • Can the model be deployed successfully by a majority of target adopters, either independently or through collaboration?
  • Are the resource requirements of operating the model sustainable for target adopters?
  • Can the model be implemented such that it does not create security threats?
  • Can the model be implemented such that it does not create privacy threats?

@anderagakura

@bmilekic I think the privacy and security characteristics should still be discussed during the process. Of course, when you build a product and work on privacy matters, you run the risk of overthinking and even effectively rewriting the law. But if we do not take them seriously during the process, we could end up with a final product that lacks the privacy and security characteristics needed for it to be usable at all, or that can only be used in some geographic areas. It would be great to make products that can be used in any area, from the trial period through to the live release.

Sure, and apologies for the lack of clarity. What I meant is that I think it's easier to analyze the proposals in terms of the utility that each provides, without thinking too much about their privacy and security characteristics. This does not mean that the privacy and security principles and characteristics are unimportant or that we should not consider them, just that they can be looked at and discussed in parallel.

If the product does not require specific "help" from the user, we can fully expand on those three points. But most products, for example, need to store some information about the user in the browser, and according to GDPR you need to obtain consent for that. Taking just this example when building a product, its use and its results can be distorted. And then comes the discussion about opt-in vs. opt-out by default... We need to find the right balance on that slide, but we cannot omit it.

I initially created this issue because when I looked at @csharrison's attribution reporting design space tradeoffs and considerations slides, the third slide, "Privacy vs Utility", lists "timely, fine-grained, accurate" as examples of utility, and I thought it would be good to expand on those three points further, to get a more detailed view of how the proposals compare w.r.t. the utility provided.

It's worth doing, and we need it in order to restore trust in the advertising ecosystem.

AramZS transferred this issue from patcg/meetings Feb 18, 2022