Feedback on Contribution bounding value, scope, and epsilon #23

alexmturner · 2023-03-14T19:37:31Z

Hi all,

We're seeking some feedback on the Private Aggregation API's contribution budget. We'd appreciate any thoughts on both the value of the numeric bound as well as the scope (currently per-origin per-day and separate for FLEDGE and Shared Storage).

In particular, one change we're considering is moving the scope from per-origin to per-site. This would reduce the potential for abuse in cases such as wildcard domains, where origins are (arguably) easier to mint than domains and could be used to exceed privacy limits. (See more discussion here.)

Thanks!

[January 2024 edit:] Additionally we would like to open the scope of this issue to understand feedback on epsilon. The Aggregation Service currently supports a range up to 64. Note that the Aggregation Service adds noise to summary reports that is distributed according to the Laplace distribution with a mean of zero and a standard deviation

sqrt(2) * L1 / epsilon

where L1 is currently 2^16. We are interested in understanding the smallest value of epsilon required to support the minimum viable functionality of your system.
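
To make the formula above concrete, here is a minimal sketch that evaluates the noise standard deviation for a few epsilon values. The function name `laplace_noise_std` and the sample epsilon values are illustrative assumptions; only the formula, L1 = 2^16, and the upper end of the range (64) come from the post.

```python
import math

L1_BOUND = 2 ** 16  # L1 value quoted in the post


def laplace_noise_std(epsilon: float, l1: int = L1_BOUND) -> float:
    """Standard deviation of the Laplace noise: sqrt(2) * L1 / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return math.sqrt(2) * l1 / epsilon


if __name__ == "__main__":
    # Sample epsilon values chosen only for illustration.
    for eps in (1, 8, 16, 64):
        print(f"epsilon={eps:>2}: std ~ {laplace_noise_std(eps):,.1f}")
```

At epsilon = 64 the standard deviation is 1024 * sqrt(2), roughly 1448, in the same units as contribution values.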


alexmturner · 2023-04-07T21:07:21Z

In addition to the change from per-origin to per-site, we're considering changing the time component of the contribution bound. Specifically, we're considering moving the existing contribution bound (max value sum of 2^16) to applying over a 10 minute window instead of a daily window. We hope this will allow more flexibility and simplify budget management. As a backstop to prevent worst-case leakage, we're considering a new larger daily bound, e.g. 2^20. We'd appreciate any feedback on this proposal!

Switches from per-origin per-day budgets to per-site per-10 min. Also adds a new per-site per-day looser budget as a backstop to prevent worst-case leakage. See original proposal here: #23 (comment)

xottabut · 2023-06-13T10:26:17Z

Hi Alex (@alexmturner), I have a few questions regarding the budget in the context of the Private Aggregation API.
I am reading this document to understand the Private Aggregation API contribution budget.

Is the L1 budget per (1) one event (i.e. ad impression) or (2) all events that happen during the period X (last 24 hours)? The documentation says: "each user agent will limit the contribution that it could make to the output of a query. In the case of a histogram operation, the user agent could bound the L1 norm of the values, i.e. the sum of all the contributions across all buckets". It is also not completely clear what the query is here.
The document says: "We initially plan to use an L1 bound of 2^16 = 65 536" and then later on the same page: "We plan to enforce a per-site budget that resets every 10 minutes; that is, we will bound the contributions that any site can make to a histogram over any 10 minute window. As a backstop to limit worst-case leakage, we plan a separate, looser per-site bound that resets daily, limiting the daily L1 norm to 2^20 = 1 048 576." So what are the final limits: 2^16 for any 10 minute window and 2^20 for the last 24 hours (two different limits, or only one of them)?
Does the "site" in "per-site" (or the "origin" in "per-origin") refer to the publisher site (page) or the reporting origin?

Two other documents with information about the budget, but for the Attribution Reporting API, are:

https://github.com/WICG/attribution-reporting-api/blob/main/AGGREGATE.md#contribution-bounding-and-budgeting
experiments with Attribution Reporting API, section about the budget:
https://docs.google.com/document/d/1bU0a_njpDcRd9vDR0AJjwJjrf3Or8vAzyfuK8JZDEfo/edit#bookmark=id.ogwovwz4ufo7 (from here I understand that in attribution API it is about limit per source/trigger event, that's why I have a question about "per event in Private Aggregation API")

Thanks!

alexmturner · 2023-06-28T16:12:27Z

Hi! Sorry for the delay in responding.

The L1 budget for Private Aggregation is over a time period as there isn't a clear notion of event in this API. We limit the sum of contributions' values to 2^16 over a rolling 10 min window for any site. Additionally, we limit the sum of contributions' values to 2^20 over a rolling 24 hour window for any site.
The query in that first bullet point is referring to a query to the aggregation service.
The origin/site for budgeting is the reporting origin/site.

Hope this answers your questions, but let me know if anything is still unclear :)
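
The two limits described in this reply could be sketched as a small tracker, shown below. This is an illustration under stated assumptions, not the browser's implementation: the class `SiteBudget`, the method `try_contribute`, timestamps in seconds that never decrease, and the half-open window `(now - length, now]` are all hypothetical choices. Only the bounds (2^16 per 10 minutes, 2^20 per 24 hours, per reporting site) come from the thread.

```python
from collections import deque

TEN_MIN_S = 10 * 60
DAY_S = 24 * 60 * 60
TEN_MIN_BOUND = 2 ** 16  # rolling 10 minute limit from the thread
DAILY_BOUND = 2 ** 20    # rolling 24 hour backstop from the thread


class SiteBudget:
    """Hypothetical tracker for one reporting site (assumed design)."""

    def __init__(self) -> None:
        # (timestamp_s, value) pairs, oldest first.
        self._history = deque()

    def _spent_since(self, cutoff_s: int) -> int:
        # Sum of accepted values with a timestamp strictly after the cutoff.
        return sum(value for t, value in self._history if t > cutoff_s)

    def try_contribute(self, now_s: int, value: int) -> bool:
        """Accept the contribution only if both rolling bounds still hold."""
        # Entries at or before now - 24h no longer count toward either bound.
        while self._history and self._history[0][0] <= now_s - DAY_S:
            self._history.popleft()
        if self._spent_since(now_s - TEN_MIN_S) + value > TEN_MIN_BOUND:
            return False
        if self._spent_since(now_s - DAY_S) + value > DAILY_BOUND:
            return False
        self._history.append((now_s, value))
        return True
```

With this sketch, a contribution of 65 536 at t = 0 is accepted, any further value at t = 5 min is rejected, and another 65 536 at t = 15 min is accepted again.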

alexmturner · 2023-06-28T16:12:37Z

Closing as this change has been made

xottabut · 2023-06-28T17:33:30Z

Thank you Alex for the response.
Sorry, but I feel like I am missing something here about "each user agent will limit the contribution that it could make to the output of a query."

If "query" refers to a query to the aggregation service (in other words, one aggregation service job that takes one batch of aggregatable reports), does it mean that in the following case the user's contribution will be at most 65 536?
Case:
A user contributes one aggregation key, key_1=65 536, at 00:00; then the same user contributes key_1=65 536 (or even key_2=65 536) at 00:15 (which is allowed by the user agent limit). But on the ad-tech side these two reports are collected into one batch and in total contribute 2 * 65536, which is over the mentioned limit. Will the contribution be either lost or cut down to 65536?

alexmturner · 2023-06-28T18:26:54Z

Ah yes this wording is a bit confusing; I'll follow up to improve it. The idea is that the user agent is limiting the contribution it can make to the output of a query -- but you're right that that limit isn't a single number, rather a 'rate' over time depending on when the reports were triggered.
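
To illustrate the "rate" framing, the sketch below computes a simple upper bound on what one user agent could contribute for one site to a batch whose reports span a given duration. The function name `max_batch_contribution` and the half-open rolling-window assumption are hypothetical; the bound is not claimed to be tight, and only the two limits come from the thread.

```python
TEN_MIN_S = 10 * 60
DAY_S = 24 * 60 * 60
TEN_MIN_BOUND = 2 ** 16
DAILY_BOUND = 2 ** 20


def max_batch_contribution(span_s: int) -> int:
    """Upper bound on the summed value of reports triggered within span_s seconds.

    Assumes each rolling window is half-open, so the span can be covered by
    floor(span / length) + 1 windows, each individually capped by its bound.
    """
    if span_s < 0:
        raise ValueError("span_s must be non-negative")
    ten_min_windows = span_s // TEN_MIN_S + 1
    day_windows = span_s // DAY_S + 1
    return min(ten_min_windows * TEN_MIN_BOUND, day_windows * DAILY_BOUND)
```

For the case raised above (reports at 00:00 and 00:15, a 15 minute span), this gives 2 * 65 536 = 131 072, so both reports can land in one batch at full value. For spans just under 24 hours the daily backstop takes over and the bound is 2^20, which equals 16 full 10 minute budgets.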

Clarify that limits are inherently a rate, not an absolute number. Addresses comments raised in #23.

chrisbmcandrew · 2023-11-16T20:06:20Z

We continue to be excited about the Aggregation Service API, and its ability to combine with Shared Storage to deliver advanced reach reporting as previously mentioned.

We believe that making adjustments to the contribution budget would ensure the functionality for Reach & Frequency measurement. Brand advertisers specifically rely on accurate Reach measurement to measure the performance of their campaigns across the web and without a reasonable contribution budget the accuracy and effectiveness of Reach measurement would be greatly impacted. The two settings are:

A per-site budget that resets every 10 minutes.

The reporting window causes limits to Reach reporting for campaigns that deliver multiple ad events within the 10 minutes, as additional events outside of the contribution budget would be dropped. Given how users browse the web a single ten minute window can have significant opportunities for ad delivery and a large subset of ad events would be lost. This loss results in wide standard deviation on per-campaign reach reporting, which would limit the creation and usefulness of reports generated from these campaigns. Reducing the overall backstop cap would still allow for a reasonable limitation while similarly ensuring measurement that aligns to how users experience ads across a single session and day.

A backstop per-site 24 hour bound limiting the L1 norm to X^x (the current limit is 2^20 = 1 048 576).

Based on how users browse the web, the combination of a rolling window and a daily cap creates additional loss. Again the result of this is a wider range of reported Reach values and an impact to the usefulness of the output. Reducing the overall backstop cap would still allow for a limitation while similarly ensuring measurement that aligns to how users experience ads across a single session and day.

In both cases reported numbers are in aggregate and use Virtual Persons methodology that maintains the overall privacy goals. We look forward to an update on these two settings to ensure Brand advertising is maintained while still providing a safe and private API.

menonasha · 2023-12-01T02:39:32Z

Appreciate the feedback - reopening this issue for discussion - we will come back with thoughts.

We wanted to clarify: is the feedback that the use case requires increased budgets for both the 10 minute cap and the daily backstop? We ask because you mention reducing the overall backstop cap in both paragraphs.

chrisbmcandrew · 2023-12-07T17:22:05Z

@alexmturner @menonasha Yes, the impact of both is that typical browsing behavior across 10 minute windows and across a day has significant opportunities for campaigns to Reach users, and a large subset of ad events would be lost. Loss of a large quantity of these events, whether due to the 10 minute cap or the 1 day cap, results in unmeasurable Reach and Frequency, which is critical to brand advertisers.

menonasha · 2023-12-21T20:08:58Z

We do understand that the contribution budget window could cause events to be dropped if a user is served a significant number of ads during the window. Ad techs should consider optimizing for the contribution budget such as by accounting for different campaign sizes or limiting the number of reports per campaign per user in a ten minute window. We would be interested to understand from the ecosystem whether the contribution budget still causes inaccurate Reach measurements after implementing optimization tactics.

In terms of providing additional budget either at the ten minute window or at the daily window, this change would allow for larger potential information gain on a given user, and so this is not in our immediate roadmap.

We would be interested to hear additional ideas of improvements we could make to solve this challenge of losing ad events while maintaining privacy. We welcome additional feedback and public discussion on this subject as we work towards a solution over the long term that addresses these concerns.
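
One possible reading of "limiting the number of reports per campaign per user in a ten minute window" is to fix a per-user event cap and give each event an equal share of the 10 minute budget. The sketch below shows that arithmetic only. The names `per_event_value` and `value_to_event_count`, and the cap of 16 in the usage note, are hypothetical; this is not a tactic prescribed in the thread.

```python
TEN_MIN_BOUND = 2 ** 16  # 10 minute contribution budget from the thread


def per_event_value(max_events_per_window: int, budget: int = TEN_MIN_BOUND) -> int:
    """Value each event contributes so that the capped number of events fits the budget."""
    if max_events_per_window <= 0:
        raise ValueError("max_events_per_window must be positive")
    return budget // max_events_per_window


def value_to_event_count(summed_value: float, max_events_per_window: int) -> float:
    """Convert an aggregated (noisy) value back into an estimated event count."""
    return summed_value / per_event_value(max_events_per_window)
```

With an assumed cap of 16 events per window, each event contributes 4096. The trade-off is that a larger cap lowers the per-event value, so the same amount of noise is larger relative to each counted event.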

alexmturner · 2024-01-11T20:07:42Z

We have added this context in the original post as well but we would like to open the scope of this issue to understand feedback on epsilon. The Aggregation Service currently supports a range up to 64. Note that the Aggregation Service adds noise to summary reports that is distributed according to the Laplace distribution with a mean of zero and a standard deviation

sqrt(2) * L1 / epsilon

where L1 is currently 2^16. We are interested in understanding the smallest value of epsilon required to support the minimum viable functionality of your system.
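
For anyone preparing feedback, inverting the formula gives the smallest epsilon that keeps the noise standard deviation at or below a chosen tolerance. In this sketch the function name `min_epsilon_for_std` and the `target_std` parameter are hypothetical; the tolerance itself has to come from your own accuracy requirements.

```python
import math

L1_BOUND = 2 ** 16  # L1 value quoted in the comment
MAX_EPSILON = 64    # upper end of the supported range quoted in the comment


def min_epsilon_for_std(target_std: float, l1: int = L1_BOUND) -> float:
    """Smallest epsilon with sqrt(2) * L1 / epsilon <= target_std."""
    if target_std <= 0:
        raise ValueError("target_std must be positive")
    return math.sqrt(2) * l1 / target_std


if __name__ == "__main__":
    target = 5000.0  # example tolerance, chosen only for illustration
    eps = min_epsilon_for_std(target)
    print(f"epsilon >= {eps:.2f}; within supported range: {eps <= MAX_EPSILON}")
```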

alexmturner mentioned this issue Mar 15, 2023

Privacy budget with Private Aggregation API WICG/turtledove#428

Open

alexmturner mentioned this issue May 16, 2023

Update contribution bound and add new 'backstop' bound #45

Merged

alexmturner closed this as completed Jun 28, 2023

alexmturner added a commit that referenced this issue Jul 13, 2023

Improving description of contribution limits

1018e36

Clarify that limits are inherently a rate, not an absolute number. Addresses comments raised in #23.

alexmturner mentioned this issue Jul 13, 2023

Improving description of contribution limits #85

Merged

alexmturner added a commit that referenced this issue Jul 13, 2023

Improving description of contribution limits (#85)

b584263

Clarify that limits are inherently a rate, not an absolute number. Addresses comments raised in #23.

alexmturner reopened this Dec 1, 2023

alexmturner changed the title ~~Contribution bounding value and scope~~ Feedback on Contribution bounding value, scope, and epsilon Jan 11, 2024

alexmturner added the looking for feedback label Jun 27, 2024
