-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix flakiness on TestFilestreamMetadataUpdatedOnRename #42213
Fix flakiness on TestFilestreamMetadataUpdatedOnRename #42213
Conversation
For some reason this test became flaky, the root of the flakiness is not on the test, it is on how a rename operation is detected. Even though this test uses `os.Rename`, it does not seem to be an atomic operation. https://www.man7.org/linux/man-pages/man2/rename.2.html does not make it clear whether 'renameat' (used by `os.Rename`) is atomic. On a flaky execution, the file is actually perceived as removed and then a new file is created, both with the same inode. This happens on a system that does not reuse inodes as soon they're freed. Because the file is detected as removed, it's state is also removed. Then when more data is added, only the offset of the new data is tracked by the registry, causing the test to fail. A workaround for this is to not remove the state when the file is removed, hence `clean_removed: false` is set in the test config.
Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane) |
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
|
|
Proposed commit message
For some reason this test became flaky, the root of the flakiness is not on the test, it is on how a rename operation is detected. Even though this test uses
os.Rename
, it does not seem to be an atomic operation. https://www.man7.org/linux/man-pages/man2/rename.2.html does not make it clear whether 'renameat' (used byos.Rename
) is atomic.On a flaky execution, the file is actually perceived as removed and then a new file is created, both with the same inode. This happens on a system that does not reuse inodes as soon they're freed. Because the file is detected as removed, it's state is also removed. Then when more data is added, only the offset of the new data is tracked by the registry, causing the test to fail.
A workaround for this is to not remove the state when the file is removed, hence
clean_removed: false
is set in the test config.Checklist
I have made corresponding changes to the documentationI have made corresponding change to the default configuration filesI have added an entry inCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.## Disruptive User Impact## Author's ChecklistHow to test this PR locally
Run
TestFilestreamMetadataUpdatedOnRename
and ensure it does not failRelated issues
I first saw this test failing on #41954, however the issue/fix does not seem related to the PR, so I'm creating a PR with the standalone fix so it can be easily backported to 8.x
Buildkite failure
## Use cases## Screenshots## Logs