Synchronise 2023.1 with upstream #121

Merged: 9 commits merged into stackhpc/2023.1 from upstream/2023.1-2024-10-21 on Oct 21, 2024

Conversation

github-actions[bot]

This PR contains a snapshot of 2023.1 from upstream stable/2023.1.

melwitt and others added 9 commits July 7, 2023 18:07
The regexes in test_archive_deleted_rows for multiple cells were
incorrect: they did not isolate the search pattern and could also
match other rows in the result table, resulting in false positives.

This fixes the regexes and also adds one more server to the test
scenario in order to make sure archive_deleted_rows iterates at least
once to expose bugs that may be present in its internal iteration.

This patch is in preparation for a future patch that will change the
logic in archive_deleted_rows. Making this test more robust will test
more thoroughly for regressions.

Change-Id: If39f6afb6359c67aa38cf315ec90ffa386d5c142
(cherry picked from commit f6620d4)
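
A minimal sketch (not the actual nova test code), assuming a
prettytable-style result like the one archive_deleted_rows prints, of
the false-positive mode described above: an unanchored pattern can
match the wrong row or a partial count, while an isolated, anchored
pattern cannot.

```python
import re

output = "| cell1.instances | 25 |\n| cell2.instances | 2 |\n"

# Loose: meant to assert that cell1 archived exactly 2 rows, but it also
# matches the "25" row, so the test would pass when it should fail.
assert re.search(r'cell1\.instances.*2', output)

# Isolated: anchor the whole row and the exact count, so only the intended
# row/value combination can satisfy the assertion. Here it correctly finds
# no match, because cell1 actually reports 25 rows.
assert re.search(r'^\| cell1\.instances \| 2 \|$', output, re.MULTILINE) is None
```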
Previously, we archived deleted rows in batches of max_rows parents +
their child rows in a single database transaction. Doing it that way
limited how high a value of max_rows could be specified by the caller
because of the size of the database transaction it could generate.

For example, in a large scale deployment with hundreds of thousands of
deleted rows and constant server creation and deletion activity, a
value of max_rows=1000 might exceed the database's configured maximum
packet size or timeout due to a database deadlock, forcing the operator
to use a much lower max_rows value like 100 or 50.

And when an operator has e.g. 500,000 deleted instances rows (and
millions of deleted rows total) to archive, being forced to use a
max_rows value several orders of magnitude lower than the number of
rows they need to archive is a poor user experience.

This changes the logic to archive one parent row and its foreign-key
related child rows at a time, each in a single database transaction,
stopping for a table once the number of archived rows reaches
max_rows. Doing this will allow operators to choose more predictable
values for max_rows and get more progress per invocation of
archive_deleted_rows.

Closes-Bug: #2024258

Change-Id: I2209bf1b3320901cf603ec39163cf923b25b0359
(cherry picked from commit 697fa3c)
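
As a rough illustration only (not the real nova implementation; the
data structures and names below are made up), the new strategy can be
sketched as: move one parent row and its child rows per small
transaction, and stop once the per-table count reaches max_rows.

```python
def archive_deleted(parents, children, max_rows):
    """Toy model: parents is a list of deleted parent ids, children maps a
    parent id to its child rows. Each loop iteration stands in for one
    small database transaction."""
    archived_parents, archived_children = [], []
    for parent_id in list(parents):
        if len(archived_parents) >= max_rows:
            break  # per-table limit reached; the caller simply runs again
        archived_children.extend(children.pop(parent_id, []))
        parents.remove(parent_id)
        archived_parents.append(parent_id)
    return archived_parents, archived_children

# With max_rows=2 only the first two parents (and their children) move,
# so each invocation does a bounded, predictable amount of work.
done, kids = archive_deleted([1, 2, 3], {1: ['a'], 2: ['b', 'c']}, max_rows=2)
print(done, kids)  # [1, 2] ['a', 'b', 'c']
```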
Today, if the write syscall to offline a CPU when
deleting an instance fails due to an OSError or ValueError,
the instance delete fails and the instance goes to ERROR.

As reported in bug #2065927, this can happen as a result of
OSError: [Errno 16] Device or resource busy if the VM is
deleted shortly after it is started.

Related-Bug: #2065927
Change-Id: I1352a3a1e28cfe14ec8f32042ed35cb25e70338e
(cherry picked from commit ee581a5)
(cherry picked from commit f1c4680)
(cherry picked from commit 254ea7b)
This change adds a retry_if_busy decorator
to the read_sys and write_sys functions in the filesystem
module that will retry reads and writes up to 5 times with
a linear backoff.

This allows nova to tolerate short periods of time where
sysfs returns device busy. If the retries are exhausted
and offlining a core fails, a warning is logged and the failure is
ignored. Onlining a core is always treated as a hard error if
retries are exhausted.

Closes-Bug: #2065927
Change-Id: I2a6a9f243cb403167620405e167a8dd2bbf3fa79
(cherry picked from commit 44c1b48)
(cherry picked from commit 1581f66)
(cherry picked from commit 6a475ac)
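
A hedged sketch of the retry behaviour described above (the real
decorator lives in nova's filesystem module and may differ in
signature and detail): retry EBUSY-style failures up to 5 times with a
linear backoff, then re-raise so the caller can decide whether the
failure is fatal (onlining a core) or only worth a warning (offlining
one).

```python
import errno
import functools
import time

def retry_if_busy(func, retries=5, delay=0.1):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        for attempt in range(1, retries + 1):
            try:
                return func(*args, **kwargs)
            except OSError as exc:
                # Only retry "Device or resource busy"; give up on the
                # last attempt and let the caller handle the error.
                if exc.errno != errno.EBUSY or attempt == retries:
                    raise
                time.sleep(delay * attempt)  # linear backoff: 0.1s, 0.2s, ...
    return wrapper

@retry_if_busy
def write_sys(path, data):
    # Simplified stand-in for nova's sysfs write helper.
    with open(path, 'w') as f:
        f.write(data)
```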
Change-Id: Id08761918ccc3477a39d9c778f3ac92679c22511
(cherry picked from commit fca941a)
(cherry picked from commit ea0f46f)
(cherry picked from commit 2c98f91)
(cherry picked from commit ed7371f)
@priteau priteau enabled auto-merge October 21, 2024 08:13
@priteau priteau merged commit 1c8031b into stackhpc/2023.1 Oct 21, 2024
3 checks passed
@priteau priteau deleted the upstream/2023.1-2024-10-21 branch October 21, 2024 08:13