Questions on Fenced_Frames_Ads_Reporting document #205

Open
vincent-grosbois opened this issue Jul 8, 2021 · 11 comments

Comments

@vincent-grosbois

Hello,
after reading this document: https://github.com/WICG/turtledove/blob/main/Fenced_Frames_Ads_Reporting.md I have a few questions:

  1. Is this document "officially supported", and will it be integrated into FLEDGE? Or is it just a proposal for now?
  2. Is 'eventData' in 'reportEvent' fully arbitrary data, or is it supposed to come from a list of allowed values?
  3. What prevents the 'eventData' payload from containing PII? Is it that, because of the FLEDGE "micro-targeting protection", it won't be possible to generate an ad in the fenced frame that contains data that is too specific?
  4. Based on the code, it's possible to send buyer-centric data (encoded in eventData) to the seller report, for instance the name of the buyer. Are we sure this isn't a privacy issue?
  5. More generally, what's the use case for allowing the reports to be sent to the seller? It seems that sending arbitrary info to both buyer and seller will easily allow fingerprinting of the user. Example: generate a new UUID in eventData and send it in both the buyer and seller reports; the buyer and seller can then collude and join data based on this UUID.

Any thoughts on this?

@jeffkaufman
Contributor

jeffkaufman commented Jul 8, 2021

Some thoughts as someone who's been following this and suggested something similar in #99 (comment):

> 2. Are 'eventData' in 'reportEvent' fully arbitrary data, or is it supposed to be within a list of possible data?

My interpretation is that it is arbitrary data.

> 3. What is preventing "eventData" payload from containing PII info? Is it that because of the Fledge "micro targeting protection", it won't be possible to generate an ad in the fenced frame that contains data that is too specific?

That sounds right: eventData comes from reportEvent, which is inside the fenced frame. The fenced frame was created via a renderUrl which:

a) had to pass through k-anonymity filtering, and
b) is already available to both buyer and seller reporting through browserSignals

> 4. Based on the code, it's possible to send buyer-centric data (encoded in eventData) to the seller report, like for instance sending it the name of the buyer, etc. Are we sure it's not an issue with Privacy?

What privacy issue do you see? For example, the seller already knows the name of the buyer through browserSignals.interestGroupOwner.

> 5. More generally, what's the use case for allowing the reports to be sent to the seller? It seems that sending arbitrary info to both buyer and seller will easily allow for fingerprinting of the user. Example: generate a new UUID in event data, send it in both buyer and seller report --> buyer and seller can collude and join data based on this UUID

The buyer and seller can already join their event-level reports. One way to do this would be for the seller to generate an event id in reportResult and put it in signalsForWinner which would then be available to reportWin in sellerSignals. Alternatively, the buyer or seller could add an event id to perBuyerSignals, which is available to both reportResult and reportWin.
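As a rough sketch of the first approach, the snippet below threads a seller-generated event id from reportResult into reportWin. The function signatures follow my reading of the FLEDGE explainer and may not match the current text exactly. `sendReportTo` is stubbed so the example runs outside a worklet, and `makeEventId`, the report URLs, and the sample inputs are all hypothetical.

```javascript
// Stub of the worklet-provided sendReportTo(); here it only records the URL
// (an assumption so the sketch can run outside a browser).
const sentReports = [];
function sendReportTo(url) {
  sentReports.push(url);
}

// Hypothetical helper: any sufficiently unique token works as a join key.
function makeEventId() {
  return Math.random().toString(36).slice(2) + Date.now().toString(36);
}

// Seller side: report the outcome and hand an event id to the winning buyer.
function reportResult(auctionConfig, browserSignals) {
  const eventId = makeEventId();
  sendReportTo(
    "https://seller.example/report?event=" + encodeURIComponent(eventId) +
    "&buyer=" + encodeURIComponent(browserSignals.interestGroupOwner)
  );
  // The return value plays the role of signalsForWinner.
  return { eventId };
}

// Buyer side: the seller's return value arrives as sellerSignals.
function reportWin(auctionSignals, perBuyerSignals, sellerSignals, browserSignals) {
  sendReportTo(
    "https://buyer.example/report?event=" + encodeURIComponent(sellerSignals.eventId)
  );
}

// Simulate the browser calling the two reporting functions in order.
const browserSignals = { interestGroupOwner: "https://buyer.example" };
const signalsForWinner = reportResult({}, browserSignals);
reportWin({}, {}, signalsForWinner, browserSignals);
```

Both recorded URLs carry the same `event` parameter, which is all the two parties need to join their logs server-side.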

In general, if the buyer and seller can receive the same information it means you don't have to reconcile diverging interpretations of what happened in the browser and no one needs to take someone else's word for what happened. For example, if a buyer and seller agree to transact on a CPC basis then it's best if they can both trigger their reporting off of the same "click" event.
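To make the shared "click" case concrete, here is a small in-memory simulation in which one reportEvent call fans the same payload out to both parties. The `fence.reportEvent({eventType, eventData, destination})` shape and the `registerAdBeacon` name are modeled on later versions of the Fenced Frames reporting proposal and may differ from the document under discussion. The explicit `party` argument, the URLs, and the payload fields are assumptions made only so the mock is self-contained.

```javascript
// Minimal in-memory stand-in for the browser's beacon plumbing
// (not the real implementation).
const beacons = { buyer: {}, seller: {} };
const delivered = [];

// Hypothetical stand-in for registering a reporting URL per event type.
// In a real worklet the party would be implicit; it is explicit here for the mock.
function registerAdBeacon(party, mapping) {
  Object.assign(beacons[party], mapping);
}

// Stand-in for the fenced frame's reporting call; the object shape is an assumption.
const fence = {
  reportEvent({ eventType, eventData, destination }) {
    for (const party of destination) {
      const url = beacons[party][eventType];
      if (url) {
        delivered.push({ party, url, body: eventData });
      }
    }
  },
};

// Each side registers where it wants "click" events delivered.
registerAdBeacon("seller", { click: "https://seller.example/click" });
registerAdBeacon("buyer", { click: "https://buyer.example/click" });

// Inside the ad: a single call reports the same event to both parties.
fence.reportEvent({
  eventType: "click",
  eventData: JSON.stringify({ clickX: 12, clickY: 34 }),
  destination: ["buyer", "seller"],
});
```

Because both deliveries originate from one call, neither side has to trust the other's count of clicks.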

@vincent-grosbois
Author

vincent-grosbois commented Jul 8, 2021 via email

@jeffkaufman
Contributor

I think what you're missing is that userBiddingSignals and all other user-level advertiser information is available only to generateBid, and not reportWin or reportResult?
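A minimal sketch of that asymmetry, assuming the argument lists from the FLEDGE explainer as I understand them: generateBid receives the interest group object (including userBiddingSignals), while reportWin has no such argument. The `recentVisitor` signal, the bid values, and the URLs are hypothetical.

```javascript
// Bidding: the interest group, and with it userBiddingSignals, is visible here.
function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  const baseBid = 1.0;
  // Hypothetical user-level signal stored by the buyer when the group was joined.
  const boost = interestGroup.userBiddingSignals.recentVisitor ? 2 : 1;
  return {
    ad: { campaign: "demo" },
    bid: baseBid * boost,
    render: interestGroup.ads[0].renderUrl,
  };
}

// Reporting: there is no interestGroup argument, so userBiddingSignals
// cannot be read here.
function reportWin(auctionSignals, perBuyerSignals, sellerSignals, browserSignals) {
  const params = new URLSearchParams({
    owner: browserSignals.interestGroupOwner,
    renderUrl: browserSignals.renderUrl,
  });
  // Returned only so the sketch can be inspected; a real worklet would send a report.
  return "https://buyer.example/win?" + params.toString();
}

const interestGroup = {
  owner: "https://buyer.example",
  name: "running-shoes",
  userBiddingSignals: { recentVisitor: true },
  ads: [{ renderUrl: "https://buyer.example/ad.html" }],
};

const bidResult = generateBid(interestGroup, {}, {}, {}, {});
const reportUrl = reportWin({}, {}, {}, {
  interestGroupOwner: interestGroup.owner,
  renderUrl: bidResult.render,
});
```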

@vincent-grosbois
Author

Ah! Indeed, from the initial interest group it seems we only get interestGroupOwner and interestGroupName from browserSignals.

Now the question is the following: what's preventing a buyer from generating a new interestGroupName for each user on its website, basically "interestGroup_userId_"? From the report sent via reportWin, you would retrieve the interest group name (containing the buyer-side user id) and some arbitrary payload from the seller (i.e. the seller-side user id). I guess this wouldn't work either, due to the micro-targeting protection?

@jeffkaufman
Contributor

In https://github.com/WICG/turtledove/blob/main/FLEDGE.md#5-event-level-reporting-for-now I see:

> The renderUrl can always be included since it has already passed a k-anonymity check, for example, but the winning interestGroupName will only be present if it has exceeded the threshold which gates daily updates.

If you used unique values for interestGroupName they wouldn't meet the threshold, and so would not be available to reportWin or reportResult.
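The gating described in that quote can be modeled roughly as follows. This is an illustrative sketch only, not the browser's implementation: the counting mechanism, the `threshold` value, the function name `buildReportingSignals`, and the group names are all hypothetical.

```javascript
// Hypothetical model: decide which fields reach reportWin / reportResult.
// usersPerGroupName maps an interest group name to how many users hold it.
function buildReportingSignals(winner, usersPerGroupName, threshold) {
  const signals = {
    interestGroupOwner: winner.owner,
    renderUrl: winner.renderUrl, // already passed its own k-anonymity check
  };
  const count = usersPerGroupName.get(winner.name) || 0;
  // The name is exposed only once it has exceeded the threshold.
  if (count > threshold) {
    signals.interestGroupName = winner.name;
  }
  return signals;
}

const usersPerGroupName = new Map([
  ["running-shoes", 5000],     // widely shared group
  ["per-user-group-8f3a", 1],  // hypothetical per-user name
]);
const threshold = 100; // placeholder value, chosen only for the example

const sharedGroupSignals = buildReportingSignals(
  { owner: "https://buyer.example", name: "running-shoes",
    renderUrl: "https://buyer.example/ad.html" },
  usersPerGroupName,
  threshold
);
const uniqueGroupSignals = buildReportingSignals(
  { owner: "https://buyer.example", name: "per-user-group-8f3a",
    renderUrl: "https://buyer.example/ad.html" },
  usersPerGroupName,
  threshold
);
```

In this model the per-user name never appears in the reporting signals, so it cannot act as a user identifier in the report.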

@vincent-grosbois
Author

vincent-grosbois commented Jul 8, 2021

Thanks, I see!
So, to summarize:

  • it's intended that event-level reports can be reconciled at the event level by the seller and the buyer, because any fine-grained data the seller passes into the buyer's report (e.g. an event ID generated by the seller) can also appear in the seller's own report
  • this event-level data can thus carry as much seller-side information as desired (not necessarily sent via FLEDGE reporting; it can be looked up in the seller's internal DB, since the event ID is a unique key)
  • on the buyer side, the only information we can recover for this event is the IG owner (a domain) and the IG name (which can't be PII), so overall no PII

Indeed, it seems that in this case there is no possible "leak" of info :) My only comment is that this is heavily biased towards the seller. You could imagine the reporting mechanism being designed the other way around, i.e. the final report could carry as much buyer-side info as desired, while on the seller side you could only find out the domain where the display occurred.

@appascoe
Collaborator

appascoe commented Jul 8, 2021

> I think what you're missing is that userBiddingSignals and all other user-level advertiser information is available only to generateBid, and not reportWin or reportResult?

I would like to comment that this is a bit of a sticking point. In order to do any real machine learning or optimization, buyers need to be able to pass at least some of the userBiddingSignals into reportWin and reportResult. On our side, we were expecting that the Aggregate Reporting API would provide some k-anonymity checks to prevent any PII from leaking.

This issue was raised in #145, but we haven't received a concrete response yet.

@vincent-grosbois
Author

vincent-grosbois commented Jul 8, 2021

I may be wrong, but my assumption is that the real reports that will allow buyers to do machine learning etc. are the ones that will go through the measurement APIs: either the aggregate reporting API or the event-level conversion API.
So to me those are two other sets of reporting APIs that will exist and be compatible with FLEDGE, in addition to the purely FLEDGE reporting system we are discussing here (reportWin and reportResult).

@appascoe
Collaborator

appascoe commented Jul 8, 2021

I've been operating under the assumption that, in the long run, reportWin and reportResult are entry points into the Aggregate Reporting API, not a separate mechanism. Of course, the Aggregate Reporting API isn't ready yet, and so in the interim, these provide more granular data.

@jeffkaufman
Contributor

@appascoe you might be interested in #164 where we're asking for aggregate reporting during generateBid and scoreAd?

@appascoe
Collaborator

appascoe commented Jul 9, 2021

Yeah, I'm interested, but not sure it solves the problem. Being able to do some logging from those functions is certainly useful, but I would say the most important place to submit feedback for aggregation is reportWin. These other functions are too far up the chain. This is because a win is a strong filter for the user having a chance to interact with the ad, e.g. predicting click performance is predicated on the ad being displayed.

3 participants