perf: Only drop columns and indexes associated with materialized columns if they exist #26664

Merged
merged 9 commits into from
Dec 5, 2024

tkaemming
Contributor

@tkaemming tkaemming commented Dec 5, 2024

Problem

DROP actions still create mutations even when there is nothing to drop, and these inconsequential mutations are slow to execute. This makes retrying a task unnecessarily slow after a partial failure or timeout.

Reducing the number of materialized columns we manage is part of #26651.

Changes

Only drop columns and indexes if they actually exist, otherwise skip these steps.

Since these tasks are run via ClickhouseCluster on specific hosts/shards, this should be resilient to partial failures on a per-shard basis.
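
As a rough illustration of the idea (not the code in this PR), the sketch below plans DROP actions from a per-host view of which columns and indexes exist, and skips the statement entirely when nothing is left to drop. `HostState`, `plan_drop_actions`, `build_statement`, and the `mat_plan` / `minmax_mat_plan` names are all hypothetical. In a real deployment the existence information would presumably come from ClickHouse system tables such as `system.columns` and `system.data_skipping_indices`; here it is simulated in memory.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class HostState:
    """Hypothetical snapshot of one host's table schema (not a PostHog type)."""

    columns: Set[str] = field(default_factory=set)
    indexes: Set[str] = field(default_factory=set)


def plan_drop_actions(state: HostState, column: str, index: str) -> List[str]:
    """Return only the ALTER actions that actually have something to drop."""
    actions: List[str] = []
    # Index first, since it references the column (ordering is an assumption of this sketch).
    if index in state.indexes:
        actions.append(f"DROP INDEX {index}")
    if column in state.columns:
        actions.append(f"DROP COLUMN {column}")
    return actions


def build_statement(table: str, actions: List[str]) -> Optional[str]:
    """Build a single ALTER statement, or None when there is nothing to do."""
    if not actions:
        return None  # no query is issued, so no mutation is created
    return f"ALTER TABLE {table} " + ", ".join(actions)


# Simulated retry after a partial failure: host "a" already finished,
# host "b" dropped only the index, host "c" has not started.
hosts: Dict[str, HostState] = {
    "a": HostState(columns={"event"}),
    "b": HostState(columns={"event", "mat_plan"}),
    "c": HostState(columns={"event", "mat_plan"}, indexes={"minmax_mat_plan"}),
}

statements = {
    name: build_statement(
        "events", plan_drop_actions(state, "mat_plan", "minmax_mat_plan")
    )
    for name, state in hosts.items()
}

for name, statement in statements.items():
    print(name, statement or "(skipped)")
```

Because the decision is made from each host's own state, a retry only issues work on the hosts that still have something to drop, which is the per-shard resilience described above.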

Does this work well for both Cloud and self-hosted?

N/A, Cloud only

How did you test this code?

Added tests to ensure the optimization works as expected; the remaining behavior is covered by existing lifecycle tests.

@tkaemming tkaemming marked this pull request as ready for review December 5, 2024 20:06
@tkaemming tkaemming requested a review from a team December 5, 2024 20:06
@tkaemming tkaemming merged commit f04d14c into master Dec 5, 2024
97 checks passed
@tkaemming tkaemming deleted the conditionally-drop-index branch December 5, 2024 20:40