[BUG] 2.15 using qat_deflate with default docker image crashes node because of missing library #168
Comments
@sandervandegeijn Did you get a chance to dig into this? Might be a 2.16 showstopper. |
Any suggestions on how to provide more useful info? The only thing I need to do is to create a new index with the codec and reindex the data. As soon as I hit send the node crashes. Dockerfile with which we extend the base image with the Azure plugin (sorry, we use S3 on the other cluster, this one uses Azure blob storage).
It looks like a CI/CD packaging / path problem to me at first glance. I hit another thing while testing the compression codecs: using zlib actually increases my index size compared with no codec specified. Still investigating that one. |
Hi @sandervandegeijn , Would you give us more details on Could you test again on the tarball artifact? Also syncing @reta @sarthakaggarwal97 into the discussion since this looks like an opensearch-project/custom-codecs issue. Thanks. |
I was not able to reproduce on the default distribution of 2.15.
|
The reason I suspect a packaging problem is the error:
Looks like the lib is missing or in the wrong path. @dblock have you tried doing a reindex to that newly created index? That's when the error occurs on my cluster. Could it otherwise be that you have the lib on your system and it's linking dynamically? |
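One rough way to check the "missing or in the wrong path" theory from inside the JVM is to look for the platform-specific library file on `java.library.path`. The sketch below is only an illustration: the class name `NativeLibraryLocator` and the placeholder library name `example` are hypothetical, the real name of the native library the QAT codec needs is not stated in this thread, and the check ignores anything the OS dynamic linker resolves through its own search paths.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class NativeLibraryLocator {

    /**
     * Returns every file on the given search path whose name matches the
     * platform-specific file name for libraryName (for example "libfoo.so" on Linux).
     */
    static List<Path> findOnPath(String libraryName, String searchPath) {
        String fileName = System.mapLibraryName(libraryName);
        List<Path> hits = new ArrayList<>();
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue; // skip empty entries such as a trailing separator
            }
            Path candidate = Path.of(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "example" is a placeholder; pass the real library name as the first argument.
        String name = args.length > 0 ? args[0] : "example";
        String searchPath = System.getProperty("java.library.path", "");
        List<Path> hits = findOnPath(name, searchPath);
        if (hits.isEmpty()) {
            System.out.println(System.mapLibraryName(name) + " not found on java.library.path");
        } else {
            hits.forEach(p -> System.out.println("found: " + p));
        }
    }
}
```

Running this in both the docker image and on a machine where the codec works would show whether the library is only present on the latter.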
@sandervandegeijn yes, sorry, I forgot to copy-paste the last part, works on my machine |
Should we also look at ways to disable this codec while we figure out a way to actually get a fix in place? Moreover, since this codec has not been working since 2.14 anyway, could we throw a 4xx instead to prevent the crash? |
Let me do some testing directly on the docker release image. Thanks. |
I tried to add a document to my index and it crashed the node.
|
That seems like a different issue: |
Confirmed this one on a Mac with Apple Silicon running the image under Docker Desktop as well. Didn't expect that error either. From the docs: it should fall back to the software implementation instead of the hardware accelerated one, but it should still work. Separate bug? |
I just tried using the docker images and I don't see the issue either:
|
Yeah that probably is a separate issue because in custom-codecs we see this
It seems like it does not support arm64 at this point. |
Related to this PR in the custom-codecs repo. Transferring the issue there and adding @reta @sarthakaggarwal97 @andrross to take a look. Thanks. |
Still it should fall back to a pure software implementation, right? Should I open a separate issue in the custom-codecs repo? The cluster runs on: Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz so that should work. |
These look like similar issues...the QAT codecs do not gracefully handle the case where they cannot be loaded (either due to incompatible hardware or a missing library). I would propose the following options for the 2.16 release:
|
@andrross thanks for sharing the options. |
Make sure to have a document in the index.
|
@andrross Why isn't there a catch all in |
Able to reproduce:
|
If the qat codec is unavailable, I doubt the shards will be green. Users can change the codec before upgrading (force merge to 1 segment, so that all segments use an old / stable codec), and then upgrade.
I think this is what we always wanted, but we couldn't reach consensus on how to mark these codecs experimental. Discussion is over at opensearch-project/OpenSearch#13992. I'm good with the 3rd option, if we can come up with a mitigation plan to fix the codecs. If we do not see that happening soon, I will vote for the 2nd option. |
@dblock These are |
adding @mulugetam to the discussion (contributor for QAT codec) |
Hi @sarthakaggarwal97 @mulugetam , Do we know if user needs to explicitly install Thanks. |
But this is |
I agree, we shouldn't return the codecs if they are not supported in that platform.
Not a requirement for 2.16. But we should implement this sooner: today, installing custom-codecs brings in all the codecs it contains, which causes issues like this. |
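The idea of not returning codecs that the platform cannot support could be sketched roughly as below. This is a minimal illustration and not the custom-codecs implementation: `CodecAvailabilitySketch`, `isAvailable`, `availableCodecs`, and the probe lambdas are all hypothetical names, and the simulated `UnsatisfiedLinkError` stands in for whatever failure a real native-library load would produce.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

public class CodecAvailabilitySketch {

    /** Runs a probe and treats any failure, including native linkage errors, as "not available". */
    static boolean isAvailable(BooleanSupplier probe) {
        try {
            return probe.getAsBoolean();
        } catch (Throwable t) {
            // UnsatisfiedLinkError and NoClassDefFoundError are Errors, not Exceptions,
            // so catching Exception alone would let them escape and take down the caller.
            return false;
        }
    }

    /** Returns only the codec names whose probe reports that the codec is usable here. */
    static List<String> availableCodecs(Map<String, BooleanSupplier> probes) {
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> entry : probes.entrySet()) {
            if (isAvailable(entry.getValue())) {
                names.add(entry.getKey());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Hypothetical probes: a real one would attempt to load and initialize the codec.
        Map<String, BooleanSupplier> probes = new LinkedHashMap<>();
        probes.put("zstd", () -> true);
        probes.put("qat_deflate", () -> {
            throw new UnsatisfiedLinkError("simulated: native library not found");
        });
        System.out.println(availableCodecs(probes)); // prints [zstd]
    }
}
```

With a filter like this applied at registration time, a request for an unavailable codec would be rejected as an unknown codec up front instead of failing later when the first segment is written.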
@dblock Yes, @sarthakaggarwal97 has implemented something like this. But I don't think we should have a general catch-all in the core layer. |
One thing to note here is that I believe from reading the EC2 documentation that only the |
Fixed by #169. Closing. |
Thanks guys |
Describe the bug
Trying to use the qat_deflate compression. According to the docs it should be there from 2.14 on. This is not correct btw, in 2.14.0 it can't be used, gives an instant error. Created a PR for the docs for that one.
In 2.15 it does not throw the error about the codec not being supported, but it does crash the node.
Related component
Storage
To Reproduce
Then reindex:
Expected behavior
Do not crash
Additional Details
Base 2.15.0 docker image with the s3 plugin installed.
Log: