
Retry updating conda packages #976

Draft · wants to merge 2 commits into main

Conversation

@soapy1 (Contributor) commented Nov 11, 2024

Fixes #852

Description

The issue described in #852 arises because part of the environment build process executes the action_add_conda_prefix_package action. This action tries to insert all of the environment's packages into the database. If these packages overlap with any of the packages inserted by task_update_conda_channel (which is common on a first run of the task), we get the integrity error.

This PR adds a retry to the part of the task_update_conda_channel task that updates the conda_package table. So, if an integrity error occurs due to a package getting inserted by another process while a bulk update is happening, then the whole process for inserting the conda packages will try again. Subsequent bulk inserts are expected to work, since part of the update process is to only insert packages that haven't already been added.
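The retry behavior described above can be sketched as a decorator. This is an illustrative mock, not conda-store's implementation: the `IntegrityError` class stands in for `sqlalchemy.exc.IntegrityError`, and the decorator arguments mirror the `retry_on_errors` signature added in this PR.

```python
import functools

class IntegrityError(Exception):
    """Stand-in for sqlalchemy.exc.IntegrityError (illustrative only)."""

def retry_on_errors(allowed_retries=1, on_errors=(), logger=None):
    """Re-run the decorated function if it raises one of `on_errors`."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempts = allowed_retries + 1  # first try + retries
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except on_errors:
                    if attempt == attempts - 1:
                        raise  # out of retries; propagate the error
                    if logger:
                        logger.warning("retrying %s", func.__name__)
        return wrapper
    return decorator
```

With `allowed_retries=1`, a bulk insert that collides with a concurrent per-package insert gets exactly one more chance; the second attempt is expected to succeed because the update only inserts packages not already present.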

Other options I looked at to resolve this issue:

  • Using upsert statements instead of insert statements.
    • sqlalchemy does not provide a platform-agnostic way to do upserts. Since conda-store is meant to be database-platform agnostic, I don't think this is a great option.
  • Locking the conda_package table against writes while the bulk insert is happening (briefly investigated).
    • I don't think this is a great option either, since it's pretty heavy-handed and may have unintended consequences as the system evolves.
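For context on the upsert option: upsert syntax differs by dialect (SQLite and PostgreSQL use `ON CONFLICT`, MySQL uses `ON DUPLICATE KEY UPDATE`), which is why a portable SQLAlchemy upsert is awkward. A minimal SQLite illustration, not conda-store code; the table shape here is hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conda_package (name TEXT PRIMARY KEY, version TEXT)")
conn.execute("INSERT INTO conda_package VALUES ('numpy', '1.26')")

# A second plain INSERT of the same key would raise IntegrityError;
# ON CONFLICT ... DO NOTHING turns the duplicate into a no-op instead.
conn.execute(
    "INSERT INTO conda_package VALUES ('numpy', '2.0') "
    "ON CONFLICT(name) DO NOTHING"
)
row = conn.execute(
    "SELECT version FROM conda_package WHERE name='numpy'"
).fetchone()
# the preexisting row is kept
```

The same statement would have to be rewritten per backend, which is the portability problem noted above.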

Pull request checklist

  • Did you test this change locally?
  • Did you update the documentation (if required)?
  • Did you add/update relevant tests for this change (if required)?


netlify bot commented Nov 11, 2024

Deploy Preview for conda-store canceled.

🔨 Latest commit 1510115
🔍 Latest deploy log https://app.netlify.com/sites/conda-store/deploys/67328ca67680630008d61efa


netlify bot commented Nov 11, 2024

Deploy Preview for conda-store canceled.

🔨 Latest commit c16061d
🔍 Latest deploy log https://app.netlify.com/sites/conda-store/deploys/67328fccf119570008a467c6

@soapy1 soapy1 added the type: bug 🐛 Something isn't working label Nov 11, 2024
@soapy1 soapy1 marked this pull request as ready for review November 12, 2024 05:30
@@ -187,3 +188,26 @@ def compile_arn_sql_like(
re.sub(r"\*", "%", match.group(1)),
re.sub(r"\*", "%", match.group(2)),
)


def retry_on_errors(allowed_retries=1, on_errors=(), logger=None):
Contributor

The reason we're seeing this issue is a race condition that happens between

  1. The task that inserts packages found from the repodata.json that gets downloaded when a channel gets updated (task_update_conda_channel), and
  2. The preexisting conda packages in the conda-store prefixes (action_add_conda_prefix_packages)

It seems like making this decorator instantly retry the decorated function upon failure is just delaying one of the two competing tasks rather than solving the underlying race condition. The bulk insert above that gets retried seems like an optimization (that's why it's using bulk insert rather than package by package insertion), so is it possible that action_add_conda_prefix_packages wouldn't yet be done by the time the retry is made?

Contributor Author

Yep, I think you're right. We could wait some time before retrying. But maybe it's worth thinking about it a little more to find a better solution 🤔
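The "wait some time before retrying" idea could look like exponential backoff between attempts. A hedged sketch; the function name and delay values are illustrative and not part of this PR:

```python
import time

def retry_with_backoff(func, on_errors, retries=3, base_delay=0.1):
    """Call func(), sleeping base_delay * 2**attempt between failures."""
    for attempt in range(retries + 1):
        try:
            return func()
        except on_errors:
            if attempt == retries:
                raise  # out of retries; propagate
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...
```

Backoff gives the competing task (e.g. action_add_conda_prefix_packages) time to finish, though it still only delays rather than removes the race.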

Contributor Author

I was thinking of a few other options:

  1. Don't run action_add_conda_prefix_packages while the task_update_conda_channel task is running. This option isn't great because if the task fails, the packages won't get added and we'll have no way to know about the missing data.
  2. When updating channels, check and insert each package instead of doing a bulk insert. This would put a lot more load on the database and make things less efficient; we insert on the order of tens of thousands of packages per architecture. Probably not a good option.
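The trade-off in option 2 can be illustrated with SQLite and a hypothetical one-column table (purely illustrative, not conda-store code): a bulk insert is one statement for the whole batch, while check-and-insert costs roughly two statements per package but tolerates individual duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE conda_package (sha TEXT PRIMARY KEY)")
packages = [(f"sha{i}",) for i in range(1000)]

# Bulk insert: a single statement, but the whole batch fails with
# IntegrityError if any row already exists.
conn.executemany("INSERT INTO conda_package VALUES (?)", packages)

# Check-and-insert: one SELECT plus one INSERT per package. Robust to
# rows inserted concurrently by another task, but ~2N statements
# instead of 1 for N packages.
for (sha,) in [("sha0",), ("new-package",)]:
    exists = conn.execute(
        "SELECT 1 FROM conda_package WHERE sha=?", (sha,)
    ).fetchone()
    if not exists:
        conn.execute("INSERT INTO conda_package VALUES (?)", (sha,))

count = conn.execute("SELECT COUNT(*) FROM conda_package").fetchone()[0]
```

At tens of thousands of packages per architecture, the per-package round trips are where the extra database load comes from.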

Labels
type: bug 🐛 Something isn't working

Successfully merging this pull request may close these issues:

[BUG] - IntegrityError upon startup