[BUG] org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload is flaky #8131

Open
reta opened this issue Jun 17, 2023 · 8 comments
Labels: bug, flaky-test, Storage:Durability, v2.15.0

Comments

@reta
Collaborator

reta commented Jun 17, 2023

Describe the bug
The org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload test is flaky:

java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {metadata__11__3__k7CAyYgBoomrakrqpTVg=1}
	at __randomizedtesting.SeedInfo.seed([245CC13AEFCE68E1:4A2CAE292B510978]:0)
	at org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:876)
	at org.apache.lucene.store.FilterDirectory.close(FilterDirectory.java:111)
	at org.apache.lucene.store.FilterDirectory.close(FilterDirectory.java:111)
	at org.opensearch.index.store.Store$StoreDirectory.innerClose(Store.java:990)
	at org.opensearch.index.store.Store.closeInternal(Store.java:554)
	at org.opensearch.index.store.Store$1.closeInternal(Store.java:194)
	at org.opensearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:78)
	at org.opensearch.index.store.Store.decRef(Store.java:529)
	at org.opensearch.index.store.Store.close(Store.java:536)
	at org.opensearch.common.util.io.IOUtils.close(IOUtils.java:89)
	at org.opensearch.common.util.io.IOUtils.close(IOUtils.java:131)
	at org.opensearch.common.util.io.IOUtils.close(IOUtils.java:81)
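
For readers unfamiliar with this kind of failure, the trace above says that the store was closed while a handle on one metadata file was still open. The following is a minimal sketch of that mechanism, using only the JDK. `TrackingDirectory`, `Handle`, and the file name are hypothetical stand-ins for illustration; this is not Lucene's `MockDirectoryWrapper` implementation, and it does not claim to show the actual cause of the leak in this test.

```java
import java.util.Map;
import java.util.TreeMap;

/** Simplified stand-in for a directory that tracks open file handles (NOT Lucene's code). */
public class OpenFileTrackingSketch {

    /** Hypothetical handle type whose close() throws no checked exception. */
    interface Handle extends AutoCloseable {
        @Override
        void close();
    }

    /** Hypothetical tracker: counts open handles per file name and refuses to close while any remain. */
    static final class TrackingDirectory {
        private final Map<String, Integer> openFiles = new TreeMap<>();

        /** Registers an open handle; closing the returned Handle unregisters it. */
        Handle openInput(String name) {
            openFiles.merge(name, 1, Integer::sum);
            return () -> openFiles.computeIfPresent(name, (k, v) -> v == 1 ? null : v - 1);
        }

        void close() {
            if (!openFiles.isEmpty()) {
                throw new RuntimeException(
                    "cannot close: there are still " + openFiles.size() + " open files: " + openFiles);
            }
        }
    }

    /** Opens a file, never closes the handle, then closes the directory. */
    static String closeWithLeak() {
        TrackingDirectory dir = new TrackingDirectory();
        dir.openInput("metadata__11__3__example"); // handle is leaked on purpose
        try {
            dir.close();
            return "closed";
        } catch (RuntimeException e) {
            return e.getMessage();
        }
    }

    /** Same sequence, but the handle is released by try-with-resources before the directory closes. */
    static String closeWithoutLeak() {
        TrackingDirectory dir = new TrackingDirectory();
        try (Handle handle = dir.openInput("metadata__11__3__example")) {
            // a real caller would read from the file here
        }
        dir.close();
        return "closed";
    }

    public static void main(String[] args) {
        System.out.println(closeWithLeak());
        System.out.println(closeWithoutLeak());
    }
}
```

In this sketch, the first call reports the leaked file in the same `{name=count}` form as the message above, and the second closes cleanly because the handle is released on every path out of the `try` block.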

To Reproduce

./gradlew ':server:test' --tests "org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload" -Dtests.seed=245CC13AEFCE68E1 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=mt-MT -Dtests.timezone=MST7MDT -Druntime.java=17

Expected behavior
The test must always pass.

Plugins
Standard


Host/Environment (please complete the following information):

  • CI

Additional context
https://build.ci.opensearch.org/job/gradle-check/17846/

@reta reta added bug Something isn't working flaky-test Random test failure that succeeds on second run labels Jun 17, 2023
@sachinpkale sachinpkale added Storage:Durability Issues and PRs related to the durability framework v2.9.0 'Issues and PRs related to version v2.9.0' and removed untriaged labels Jun 19, 2023
@sachinpkale
Member

@linuxpi Can you please look into this?

@linuxpi
Collaborator

linuxpi commented Jun 19, 2023

Sure, will take a look

@linuxpi
Collaborator

linuxpi commented Jul 11, 2023

The failure is not reproducible locally, even with 2000 iterations:

./gradlew ':server:test' --tests "org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload" -Dtests.seed=245CC13AEFCE68E1 -Dtests.security.manager=true -Dtests.jvm.argline="-XX:TieredStopAtLevel=1 -XX:ReservedCodeCacheSize=64m" -Dtests.locale=mt-MT -Dtests.timezone=MST7MDT -Druntime.java=17 -Dtests.iters=2000

@reta
Collaborator Author

reta commented Jul 11, 2023

@linuxpi just from today's CI builds [1]:

org.opensearch.index.shard.RemoteStoreRefreshListenerTests.testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload

java.lang.RuntimeException: MockDirectoryWrapper: cannot close: there are still 1 open files: {metadata__9223372036854775788__9223372036854775804__9223372036854775805__9223372036854775804__9223370347789126145__1=1}
	at __randomizedtesting.SeedInfo.seed([42912DBBDDFA3EF5:2CE142A819655F6C]:0)
	at org.apache.lucene.tests.store.MockDirectoryWrapper.close(MockDirectoryWrapper.java:876)
	at org.apache.lucene.store.FilterDirectory.close(FilterDirectory.java:111)
	at org.apache.lucene.store.FilterDirectory.close(FilterDirectory.java:111)
	at org.opensearch.index.store.Store$StoreDirectory.innerClose(Store.java:1001)
	at org.opensearch.index.store.Store.closeInternal(Store.java:554)
	at org.opensearch.index.store.Store$1.closeInternal(Store.java:194)
	at org.opensearch.common.util.concurrent.AbstractRefCounted.decRef(AbstractRefCounted.java:78)
	at org.opensearch.index.store.Store.decRef(Store.java:529)

Failing for the past 1 build (since [#19825](https://build.ci.opensearch.org/job/gradle-check/19825/)).
[Took 1.6 sec.](https://build.ci.opensearch.org/job/gradle-check/19825/testReport/junit/org.opensearch.index.shard/RemoteStoreRefreshListenerTests/testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload/history)

[1] https://build.ci.opensearch.org/job/gradle-check/19825/testReport/junit/org.opensearch.index.shard/RemoteStoreRefreshListenerTests/testRefreshSuccessAfterFailureInFirstAttemptAfterSnapshotAndMetadataUpload/
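
The numeric fields in the leaked file name from this run are all close to `Long.MAX_VALUE` (9223372036854775807), which suggests they may be stored as `Long.MAX_VALUE - value`. The sketch below decodes the name under that assumption. The class name `MetadataNameDecoder` is hypothetical, the meaning of each field is not stated in this thread, and the last field is assumed to be a plain number and is left untouched. The decoding does not apply to the shorter file name in the original report.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

/** Decodes a metadata file name, ASSUMING each long field is stored as Long.MAX_VALUE - value. */
public class MetadataNameDecoder {

    static List<Long> decode(String fileName) {
        String[] parts = fileName.split("__");
        List<Long> values = new ArrayList<>();
        // parts[0] is the "metadata" prefix; the last part is skipped (assumed to be a plain number).
        for (int i = 1; i < parts.length - 1; i++) {
            values.add(Long.MAX_VALUE - Long.parseLong(parts[i]));
        }
        return values;
    }

    public static void main(String[] args) {
        String name = "metadata__9223372036854775788__9223372036854775804__9223372036854775805"
            + "__9223372036854775804__9223370347789126145__1";
        List<Long> values = decode(name);
        System.out.println(values);
        // The fifth value looks like epoch milliseconds, so print it as an instant.
        System.out.println(Instant.ofEpochMilli(values.get(4)));
    }
}
```

Under that assumption the fields decode to `[19, 3, 2, 3, 1689065649662]`, and the last of these, read as epoch milliseconds, is `2023-07-11T08:54:09.662Z`, which is consistent with the date of this CI run.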

@sachinpkale
Member

Taking a look.

@sachinpkale
Member

This issue is fixed with: bc7a3ee

@ashking94
Member

Reopening this as this test failed in #12607.

@github-project-automation github-project-automation bot moved this to Planned work items in Test roadmap format Apr 22, 2024
@BhumikaSaini-Amazon BhumikaSaini-Amazon added v2.9.0 'Issues and PRs related to version v2.9.0' and removed v2.9.0 'Issues and PRs related to version v2.9.0' labels Apr 22, 2024
@gbbafna gbbafna assigned gbbafna and unassigned linuxpi May 3, 2024
@Bukhtawar Bukhtawar added v2.15.0 Issues and PRs related to version 2.15.0 and removed v2.14.0 labels May 16, 2024
@shourya035
Member

@gbbafna Please update the release target, or close this issue if the corresponding PR is merged.
