Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow including master table in cascading delete #1158

Merged
merged 15 commits into from
Aug 19, 2024
Merged

Conversation

CBroz1
Copy link
Contributor

@CBroz1 CBroz1 commented Apr 24, 2024

#151 included a discussion of 'Option 2' in which the master would be deleted during a cascade, with a possible solution proposed in this PR. This feature would be particularly useful to the Spyglass team because our 'Merge tables' make use of foreign-key references in parts to version pipelines.

Happy to incorporate feedback

@CBroz1 CBroz1 marked this pull request as ready for review June 13, 2024 01:28
@lfrank
Copy link

lfrank commented Jul 9, 2024

@dimitri-yatsenko @ttngu207 Hi Dmitri and Thinh. I just wanted to check in on this pull request and another one (#1160). It would really help us to get these into a new release if that's possible.

@edeno
Copy link

edeno commented Aug 5, 2024

Hi @dimitri-yatsenko @ttngu207,

Sorry to bother again, but it would be helpful to know for our (Spyglass) planning purposes whether this PR is likely to get reviewed/merged in the near future. We know there are probably a lot of things on your plate and we understand if this is not among your highest priorities.

Thank you for all your work on this software.

@ethho
Copy link
Contributor

ethho commented Aug 5, 2024

it would be helpful to know for our (Spyglass) planning purposes whether this PR is likely to get reviewed/merged in the near future.

Hi @edeno, I'll review this by end of week. Thanks all for the PR.

@ttngu207 @dimitri-yatsenko @lfrank Do we need to publish a new release? Or is it sufficient to merge to main?

@ethho ethho requested a review from dimitri-yatsenko August 5, 2024 22:18
@edeno
Copy link

edeno commented Aug 5, 2024

It would be greatly helpful for us to have a release.

datajoint/table.py Outdated Show resolved Hide resolved
@ethho
Copy link
Contributor

ethho commented Aug 7, 2024

@CBroz1 I would recommend checking out changes from #1164 so that tests run.

@CBroz1
Copy link
Contributor Author

CBroz1 commented Aug 7, 2024

@CBroz1 I would recommend checking out changes from #1164 so that tests run.

Thanks @ethho! If you plan to merge that first, I'll keep an eye out and fetch upstream when it does

@ethho
Copy link
Contributor

ethho commented Aug 9, 2024

If you plan to merge that first, I'll keep an eye out and fetch upstream when it does

@CBroz1 #1164 was merged to master.

@CBroz1
Copy link
Contributor Author

CBroz1 commented Aug 9, 2024

Thanks, @ethho. I fetched from upstream

Copy link
Contributor

@ethho ethho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

Looks good overall. I request the following changes:

  1. Update documentation to reflect the new behavior. It is sufficient to update this section on deletes.
  2. Rename kwarg to force_masters as discussed above
  3. Expand regression tests to cover edge cases

Edge Cases

The logic changes in delete are sensible and address the requirements laid out in #151.

While I can't think of any breaking edge cases, @ttngu207 and I think that we should include some of these cases in regression tests.
Notably, we should test what happens when a part table is visited more than once in the cascade.
The new tests cover the case where the master table is visited twice, and we save some work by skipping visited_masters.
But what if a part table shares a master with its children or grandchildren?
cascade will visit the downstream part table twice.
This is harmless currently because quick_delete is idempotent, but it's worth testing in case delete behavior changes. For example, we could add a table to schema_simple:

class I(dj.Part):
    definition = """ # test no additional fk reference
    -> E
    id_i :int
    ---
    -> H
    """

Kwarg Interaction

No change requested, but note that my_part_table.delete(force_parts=False, force_masters=True) will delete the master table, even though force_parts=False. This behavior makes sense to me: force_masters=True will always supercede force_parts.

datajoint/table.py Outdated Show resolved Hide resolved
datajoint/table.py Outdated Show resolved Hide resolved
datajoint/table.py Show resolved Hide resolved
tests/test_cascading_delete.py Outdated Show resolved Hide resolved
tests/test_cascading_delete.py Outdated Show resolved Hide resolved
@CBroz1
Copy link
Contributor Author

CBroz1 commented Aug 12, 2024

Thanks for your review @ethho!

my_part_table.delete(force_parts=False, force_masters=True) will delete the master table, even though force_parts=False.

Given that this is currently the default behavior, should force_parts=True turn off force_masters, or log a warning? Otherwise, in order to get the previous force_parts=True behavior, users will need to set force_parts=True, force_masters=False.

@ethho
Copy link
Contributor

ethho commented Aug 13, 2024

Given that this is currently the default behavior, should force_parts=True turn off force_masters, or log a warning?

@CBroz1 Based on my reading, force_masters=True always implies force_parts=True, since master is always in deleted in this expression when force_masters=True. force_masters=False restores the behavior before this PR, where the value of force_parts is relevant.

Then we should document the behavior well and keep the logic as is, without raising a warning. A warning would get very annoying as the default behavior. One option that I don't endorse is: change default behavior to force_masters=True, force_parts=True, but this would result in surprising behavior for users accustomed to delete() who refactor to delete(force_masters=False) in the new version.

@CBroz1
Copy link
Contributor Author

CBroz1 commented Aug 13, 2024

Thanks @ethho, sounds good. I made note of this in the changelog.

If we could get this into a 14.2 release, my team would really benefit both from this PR as well as #1106. Happy to make adjustments here if that makes sense, but I understand if there are other features you're targeting for inclusion in 14.2

@ethho
Copy link
Contributor

ethho commented Aug 13, 2024

If we could get this into a 14.2 release, my team would really benefit both from this PR

In my opinion, this change is a large enough to be in the major release 0.15 instead of 0.14.2. The preferred approach for us would be a pre-release 0.15.dev0. Would this work for you? Note that datajoint will continue to point to datajoint==0.14.1 as the latest, but your team could explicitly set datajoint==0.15.dev0.

@CBroz1
Copy link
Contributor Author

CBroz1 commented Aug 13, 2024

Sure, that works for me, thanks! Should I make any edits to the changelog to reflect the pre-release?

CHANGELOG.md Outdated Show resolved Hide resolved
Co-authored-by: Ethan Ho <[email protected]>
@ethho
Copy link
Contributor

ethho commented Aug 13, 2024

For anyone urgently blocked on 0.15.dev0, note that you can always pip install git+: https://stackoverflow.com/a/20101940

datajoint/table.py Outdated Show resolved Hide resolved
datajoint/table.py Outdated Show resolved Hide resolved
tests/test_cascading_delete.py Outdated Show resolved Hide resolved
@CBroz1 CBroz1 requested a review from ethho August 18, 2024 23:31
@ethho ethho merged commit 5a6077f into datajoint:master Aug 19, 2024
11 checks passed
@ethho
Copy link
Contributor

ethho commented Aug 22, 2024

@CBroz1 @edeno @lfrank This PR was included in release 0.4.2 built by CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants