-
Notifications
You must be signed in to change notification settings - Fork 28.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-45533][CORE] Use j.l.r.Cleaner instead of finalize for RocksDBIterator/LevelDBIterator #43502
Conversation
cc @LuciferYang |
also cc @mridulm @beliefer @srowen @dongjoon-hyun |
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
Although the current pr only fixes part of RocksDB now, I suggest adding the LevelDB part in this PR once the RocksDB part is deemed feasible. I prefer to complete them in one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do we finish the tests?
ok, will add LevelDB part later. |
I haven’t thought of a good testing method yet, do you have any suggestions? |
Firstly, we may need to add two helper methods:
Then, the test case may like the following: @Test
public void testResourceCleaner() throws Exception {
File dbPathForCleanerTest = File.createTempFile(
"test_db_cleaner.", ".rdb");
dbPathForCleanerTest.delete();
RocksDB dbForCleanerTest = new RocksDB(dbPathForCleanerTest);
for (int i = 0; i < 8192; i++) {
dbForCleanerTest.write(createCustomType1(i));
}
RocksDBIterator<CustomType1> rocksDBIterator = (RocksDBIterator<CustomType1>) dbForCleanerTest.view(CustomType1.class).iterator();
Optional<Reference<RocksDBIterator<?>>> referenceOpt = dbForCleanerTest.iteratorReference(rocksDBIterator);
assertTrue(referenceOpt.isPresent());
RocksIterator it = rocksDBIterator.internalIterator();
assertTrue(it.isOwningHandle()); // it has not been closed yet, isOwningHandle should be true.
rocksDBIterator = null; // Manually set rocksDBIterator to null, to be GC.
Reference<RocksDBIterator<?>> ref = referenceOpt.get();
// 100 times gc, the rocksDBIterator should be GCed.
int count = 0;
while (count < 100 && !ref.refersTo(null)) {
System.gc();
count++;
Thread.sleep(100);
}
assertTrue(ref.refersTo(null)); // check rocksDBIterator should be GCed
// Verify that the Cleaner will be executed after a period of time, and it.isOwningHandle() will become false.
assertTimeout(java.time.Duration.ofSeconds(5), () -> assertFalse(it.isOwningHandle()));
dbForCleanerTest.close();
} Of course, we can also add a state in |
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
Adding a state is simpler and works for different classes, but I don't know if adding variables for testing is a common approach. |
Let's add a test case first. |
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBIteratorSuite.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDB.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Show resolved
Hide resolved
common/kvstore/src/main/java/org/apache/spark/util/kvstore/RocksDBIterator.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/test/java/org/apache/spark/util/kvstore/DBIteratorSuite.java
Outdated
Show resolved
Hide resolved
common/kvstore/src/test/java/org/apache/spark/util/kvstore/RocksDBIteratorSuite.java
Outdated
Show resolved
Hide resolved
// Verify that the Cleaner will be executed after a period of time, and it.isOwningHandle() will become false. | ||
assertTimeout(java.time.Duration.ofSeconds(5), () -> assertFalse(it.isOwningHandle())); | ||
|
||
dbForCleanerTest.close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's move dbForCleanerTest.close()
to finally block and t would be better to add FileUtils.deleteQuietly(dbPathForCleanerTest);
after close in finally block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it's fine as long as we can ensure there are no residues after testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, if remove it, there is an exception.
org.rocksdb.RocksDBException: `/Users/zhaomin/Files/project/spark/common/kvstore/target/tmp/test_db_cleaner.6329921973463604622.rdb' exists but is not a directory
at org.rocksdb.RocksDB.open(Native Method)
at org.rocksdb.RocksDB.open(RocksDB.java:249)
at org.apache.spark.util.kvstore.RocksDB.<init>(RocksDB.java:122)
at org.apache.spark.util.kvstore.RocksDB.<init>(RocksDB.java:116)
return iteratorTracker; | ||
} | ||
|
||
@VisibleForTesting |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm.. this method can be moved to RocksDBSuite.java as a private method due to RocksDB
already has getIteratorTracker(
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
friendly ping @dongjoon-hyun @mridulm @beliefer Could you please take another look at this pr if you have time? Thanks ~ |
…elDBIterator.java Co-authored-by: YangJie <[email protected]>
…elDBIterator.java Co-authored-by: Mridul Muralidharan <[email protected]>
Merged into master for Spark 4.0. |
Thank you very much for your help. |
What changes were proposed in this pull request?
use java.lang.ref.Cleaner instead of finalize() for RocksDBIterator
Why are the changes needed?
The finalize() method has been marked as deprecated since Java 9 and will be removed in the future, java.lang.ref.Cleaner is the more recommended solution.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Pass actions and new tests
Was this patch authored or co-authored using generative AI tooling?
No