Remove not needed validation on privacy_id_extractor #499

Merged
merged 1 commit into from
Oct 25, 2023

Conversation

dvadym
Collaborator

@dvadym dvadym commented Oct 24, 2023

When contribution_bounds_already_enforced = True, PipelineDP does no contribution bounding, so a privacy id is not needed. The current validation requires privacy_id_extractor to be None, although nothing breaks if a privacy id extractor is set. The requirement mostly confuses users, who have no obvious reason to expect that privacy_id_extractor has to be None. Let's drop this validation for simplicity.
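To make the change concrete, here is a minimal sketch of the check before and after this PR. It does not use PipelineDP itself: `DataExtractors`, `AggregateParams`, `check_before`, `check_after`, and the error message are hypothetical stand-ins that only mirror the field names mentioned above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class DataExtractors:
    # Hypothetical stand-in for the extractors container, not the real class.
    partition_extractor: Optional[Callable[[Any], Any]] = None
    value_extractor: Optional[Callable[[Any], Any]] = None
    privacy_id_extractor: Optional[Callable[[Any], Any]] = None


@dataclass
class AggregateParams:
    # Hypothetical stand-in that keeps only the flag discussed in this PR.
    contribution_bounds_already_enforced: bool = False


def check_before(params: AggregateParams, extractors: DataExtractors) -> None:
    """Behaviour described as 'current': a privacy id extractor is rejected."""
    if params.contribution_bounds_already_enforced:
        if extractors.privacy_id_extractor:
            # Illustrative message, not the library's actual wording.
            raise ValueError("privacy_id_extractor must be None when "
                             "contribution_bounds_already_enforced is True")


def check_after(params: AggregateParams, extractors: DataExtractors) -> None:
    """Behaviour after the PR: the extractor is accepted and never rejected."""
    return None
```

With `contribution_bounds_already_enforced=True` and a privacy id extractor set, `check_before` raises `ValueError` while `check_after` accepts the same configuration.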

@dvadym dvadym changed the title (WIP) Remove not needed validation Remove not needed validation on privacy_id_extractor Oct 25, 2023
@@ -420,10 +420,6 @@ def _check_aggregate_params(self,
        if check_data_extractors:
            _check_data_extractors(data_extractors)
        if params.contribution_bounds_already_enforced:
            if data_extractors.privacy_id_extractor:
Contributor

Would it be possible to log a warning? Clients might have an incorrect configuration in this case.

Collaborator Author

I'd say that having privacy_id_extractor set is a correct configuration even if no contribution bounding happens, e.g. in principle it would allow computing PRIVACY_ID_COUNT per partition. So let's keep it without warnings.
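As a rough illustration of why the extractor can still be useful, the sketch below counts distinct privacy ids per partition from raw rows. It is a plain, non-private count with no noise added, not PipelineDP's implementation of PRIVACY_ID_COUNT; the function name and the sample rows are made up for the example.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, Iterable


def privacy_id_count_per_partition(
        rows: Iterable[Any],
        privacy_id_extractor: Callable[[Any], Any],
        partition_extractor: Callable[[Any], Any]) -> Dict[Any, int]:
    """Counts distinct privacy ids in each partition (raw count, no DP noise)."""
    ids_by_partition = defaultdict(set)
    for row in rows:
        ids_by_partition[partition_extractor(row)].add(privacy_id_extractor(row))
    return {partition: len(ids) for partition, ids in ids_by_partition.items()}


# Hypothetical rows of (user, day); "alice" appears twice on "mon".
rows = [("alice", "mon"), ("alice", "mon"), ("bob", "mon"), ("alice", "tue")]
counts = privacy_id_count_per_partition(rows,
                                        privacy_id_extractor=lambda r: r[0],
                                        partition_extractor=lambda r: r[1])
print(counts)  # {'mon': 2, 'tue': 1}
```

Without a privacy id extractor, the duplicate "alice" row could not be told apart from a second user, which is the point made in the comment above.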

Collaborator Author

@dvadym dvadym left a comment

Thanks for the review!

@dvadym dvadym merged commit 7740030 into OpenMined:main Oct 25, 2023
10 of 14 checks passed