Thread deadlock in LayerMetadataStore when writing parameters to a temp properties file #1276
Comments
I'm not sure it's the same, but this reminds me a lot of the issue fixed in this pull request. The fix was first released in 1.25.1 a few days ago, and should be part of the 1.24.x series next month. Also, I recommend using a nightly build if you're testing the Azure blob store, or the test could become expensive: #1149 (that issue has since been fixed, but the fix has also only been released in 1.25.1 so far)
Thanks Andrea for your assistance. Concerning the 2 issues mentioned above (parameter storage for FileBlobStore, ListBlobs issue for AzureBlobStore), is there a planned release date for 1.24.4 (so GeoServer 2.24.4, I guess) that these fixes may be included in?
1.24.4 should be released around the 18th of the month. About the truncation being inefficient: we're aware and have some ideas, but are waiting for funding to show up. If you're up for making pull requests on your own, I'm happy to explain some of the most immediate changes that would improve truncate performance. Classloader issues with plugins... are you using GWC along with GeoServer and with some community modules in the mix? On 1.25.x we just merged a rather large PR that overhauls how community modules are packaged, which should help in that respect: geoserver/geoserver#7714
Nice! Yes, I can potentially contribute. I'm waiting to see where we land in the next couple of weeks. We have to scale for an expected load increase imminently. We may end up with an interim solution and then a future plan. Yes, our setup is GWC embedded in GeoServer, with some community plugins. I believe my colleague may have asked about the ClassLoader issues, maybe on the mailing list. I need to catch up on that aspect.
Some positive feedback on this one. With the previously mentioned fix for #880, in PR #1230, we now get substantial performance improvements with FileBlobStore: anywhere between 2x and 12x the throughput (there are so many variables in our setup that I can't put a precise figure on it, but this, together with other config tweaks that allow us to scale, brings us close to the 12x mark). I've tested this with both the 1.24.x nightly builds and with 1.24.4 now that it has been released. It also seems that this thread deadlock should no longer be possible, as I can see that the code path in question should no longer be traversed. I can't verify that 100% though, until we give this a solid hit-out for a decent time period in our prod env.
LayerMetadataStore appears to sometimes encounter a thread deadlock when writing parameters to a temp properties file. The problem appears to be quite rare (~1 in every few million tile requests, which is once or twice a week for us in Production) but when it does occur, it will effectively bring the server to a halt.
Our setup:
Symptoms:
The issue is difficult to reproduce, so I’ve not included a test.
I do however have a Java Flight Recorder output which detected the deadlock. I’ll attempt to attach this here:
1e16e3f4984b4c37b3690c7369987cfb.jfr.zip
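For anyone who wants to check for the same condition without a JFR recording, here is a minimal sketch using the standard `java.lang.management` API. The class name `DeadlockProbe` and the output format are my own (hypothetical), and this only reports deadlocks the JVM itself can see (monitors and ownable synchronizers), so it may not cover every kind of hang:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper, not part of GeoWebCache.
public class DeadlockProbe {

    /** Returns one line per deadlocked thread, or an empty string if none are found. */
    public static String describeDeadlocks() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        // Returns null when no threads are deadlocked.
        long[] ids = bean.findDeadlockedThreads();
        if (ids == null) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.getThreadInfo(ids, true, true)) {
            if (info == null) {
                continue; // thread terminated in the meantime
            }
            sb.append(info.getThreadName())
              .append(" waits on ").append(info.getLockName())
              .append(" held by ").append(info.getLockOwnerName())
              .append('\n');
        }
        return sb.toString();
    }
}
```

Calling `DeadlockProbe.describeDeadlocks()` from a scheduled task and logging any non-empty result would be one way to catch a rare occurrence in a long-running server.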
I can attempt to assist in a fix - but I can’t see precisely where the problem is. Below, I’ve done my best to highlight whereabouts the problem is. I’m seeking any thoughts/ideas. My best guess is that after obtaining a hash, it appears to use a potentially unsafe array of locks.
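To make that guess concrete, here is a sketch of the general pattern I have in mind. It is not the actual LayerMetadataStore code: `StripedLocks`, `stripeFor` and `withBoth` are hypothetical names. With an array of locks indexed by hash, a deadlock becomes possible if one thread holds stripe i and requests stripe j while another holds j and requests i. Always acquiring the lower index first rules that ordering problem out:

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration of hash-striped locking, not GeoWebCache code.
public class StripedLocks {
    private final ReentrantLock[] locks;

    public StripedLocks(int stripes) {
        locks = new ReentrantLock[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
        }
    }

    /** Maps a key to a stripe index in [0, stripes). */
    public int stripeFor(String key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    /** Runs the action while holding the stripes for both keys, lower index first. */
    public void withBoth(String keyA, String keyB, Runnable action) {
        int a = stripeFor(keyA);
        int b = stripeFor(keyB);
        int first = Math.min(a, b);
        int second = Math.max(a, b);
        locks[first].lock();
        try {
            // Both keys may hash to the same stripe; lock it only once.
            if (second != first) {
                locks[second].lock();
            }
            try {
                action.run();
            } finally {
                if (second != first) {
                    locks[second].unlock();
                }
            }
        } finally {
            locks[first].unlock();
        }
    }
}
```

Whether the real code ever holds two stripes at once is exactly what I have not been able to confirm.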
The Deadlock, per Java Flight Recorder
From the JFR output, the proof of the deadlock is as follows (which, by itself, is not overly helpful):
The relevant stack traces of threads 13 and 44 are below (I've not included the full traces, for brevity; they are visible in the JFR zip though):
Relevant sections in the code
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/FileBlobStore.java
Line 677 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 118 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 167 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 175 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 232 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 249 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 146 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 137 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 72 in e0244e1
geowebcache/geowebcache/core/src/main/java/org/geowebcache/storage/blobstore/file/LayerMetadataStore.java
Line 75 in e0244e1
Any thoughts welcome. In the meantime, we'll be trialling the Azure BlobStore plugin, to get around this issue.