Restructuring the SSL event handling #228
Always try to write if there is something in the buffer, and always try to read. Remember why SSL failed, use that as the monitoring flags, and filter in process().
…nt the return value
… called monitor->events
Also refactors redundant code into a helper function
Could we maybe get some feedback on this PR? I'm also wondering how this relates to 80ce632.
There were indeed problems in handling SSL connections. The changes in 80ce632 were supposed to fix this. Does the issue still exist?
We have not tested 80ce632 because we have been running this PR's code-base for the last 5 months. I don't think 80ce632 covers all problematic cases. The state handling is still messy. Consider the example that we observed failing:
I can see this still happening, as there is no guarantee that the read will fully succeed just because the socket was actually readable (as opposed to wrongly cached as readable). Overall, I think running your own checks on the socket is the wrong way to go if you are in an async loop. This introduces possibilities for and redundancy, as well as many redundant syscalls and thus performance issues. That said the blocking
@EmielBruijntjes We took a look at the recent b81bc34. It seems that it still uses the limited set of states, which can end up inconsistent. While reproducing issues with this is extremely difficult, I believe there is a clear argument for the code. We would be happy to discuss this further.
I do agree that the earlier implementation of the SslConnected class was not correct. But since the latest changes I have the impression that it does work. Can you explain which (edge) cases we are missing with the current implementation?
The case I am concerned about is the following:
About point 7: does this indeed happen? I thought that SSL_ERROR_WANT_READ is only returned when the read operation should be repeated (because it was only partially ready). If the read operation fails because no data is available, it returns 0 and goes back to the idle state.
If Also:
So I don't see a way to fall back to regular idle with 0 bytes received.
About step 10: is that permitted? If SSL_ERROR_WANT_READ was returned by an earlier call, is it then allowed to call SSL_write() instead of repeating the SSL_read() operation?
I don't think it is permitted to mix SSL_read() and SSL_write() calls if one of them returned WANT_READ or WANT_WRITE (check https://stackoverflow.com/questions/35138931/can-you-interleave-calls-to-ssl-read-ssl-write-when-one-of-them-returns-ss for example). As far as I can see, the only option then is to repeat the call. The scenario that you are painting is, simplified, as follows:
This is either an (unlikely) bug in OpenSSL, or OpenSSL already deals with this and this scenario does not occur.
I think the StackOverflow answer is wrong, but there is also a link to the mailing list, which discusses a slight variation of the scenario I am concerned about:
In AMQP-CPP, step 1) would cause a repeated One point regarding this PR is that the want-flags should be reset to their default whenever any of the functions has received new want flags. I somehow think the default want-flags for Unfortunately the official documentation is indeed not very helpful. There is one paragraph in
The second sentence is rather confusing - I suspect it means "it should eventually be called again". It doesn't say whether it is safe to call |
There's also this paragraph in
One might read into this
Thanks for all your input. I find it very interesting because it would mean that I made some mistakes in other software too. But there are still a lot of unanswered questions here:
I don't have authoritative answers, just my interpretation of the documentation / mailing list.
My colleague has been looking into this further. We now think that it could indeed be possible to mix calls to SSL_read() and SSL_write(), but we have to check it more closely to be sure. We also discovered that, although the OpenSSL 1.0 docs say that a repeated SSL_write() must be called with exactly the same parameters as the previous call (our interpretation: with the same memory addresses), the call can in reality also be repeated with the same data stored at a different location. This would allow us to copy buffers only when a repeat is necessary, instead of prematurely before each call just in case the call fails. Anyway, there is some room for improvement in the SSL handling, because we no longer have to delay writing if a read operation is also in progress, and the other way around. And we do not have to buffer all outgoing data, but only the data that could not be sent at the first attempt. Whether the current implementation is also broken is not yet clear.
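To illustrate the "copy only when a repeat is necessary" idea from the comment above, here is a minimal sketch. It does not use OpenSSL or the AMQP-CPP classes: `Sender` and `LazyOutBuffer` are hypothetical names, and the `Sender` callback merely stands in for a non-blocking SSL write that either accepts the data or asks to be retried. Each failed chunk is stored separately, so a retry repeats exactly the same bytes and length. With real OpenSSL, retrying from a different address is what the documented `SSL_MODE_ACCEPT_MOVING_WRITE_BUFFER` mode is meant to allow; whether that applies to the version and code path in question would still need to be verified.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for a non-blocking SSL write:
// returns true if the data was accepted, false if the call must be retried.
using Sender = std::function<bool(const char *data, std::size_t size)>;

// Hypothetical helper: copies outgoing data only after a first attempt failed.
class LazyOutBuffer
{
    std::deque<std::string> _pending;   // chunks that still have to be (re)sent
    Sender _send;

public:
    explicit LazyOutBuffer(Sender send) : _send(std::move(send)) {}

    void send(const char *data, std::size_t size)
    {
        // preserve ordering: never overtake data that is already queued
        if (!_pending.empty()) { _pending.emplace_back(data, size); return; }

        // first attempt goes straight from the caller's memory, no copy
        if (!_send(data, size)) _pending.emplace_back(data, size);
    }

    // call when the event loop reports the condition the write was waiting for
    void flush()
    {
        // retry each chunk with identical content and length
        while (!_pending.empty() &&
               _send(_pending.front().data(), _pending.front().size()))
        {
            _pending.pop_front();
        }
    }

    std::size_t buffered() const { return _pending.size(); }
};
```

In this sketch the common case (the first write succeeds) never touches the queue, which is the saving the comment describes.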
As per the OpenSSL documentation:
The previous implementation considered this only partially and inconsistently. In particular, a `write` could call a `read` when it was caused by a `readable` event, but then the `read` failed and changed the state to `reading`, and no more writing was done.

This PR simplifies the whole read/write handling. Basically, the first read/write is always attempted¹ to find out what SSL wants (`SSL_ERROR_WANT_READ/WRITE`). This is saved in the state and used for fd monitoring. For the next `process()`, these flags are used as filters. `read`, `write`, and `proceed` are always invoked directly from `process`.

¹ The constructor does not attempt read/write; it sets the `want_flags` to `readable | writable` instead, to watch and react to any kind of fd events.

Fixes #207. This is only weakly tested with ASIO and libev. It needs more testing and review.
Thanks to @bmario @kinnarr
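As a rough illustration of the want-flag filtering described in the PR description, the following sketch keeps one set of want flags per operation, starts with `readable | writable`, and uses the flags both for fd monitoring and as a filter in `process()`. It is not the code from this PR: `WantTracker`, `Outcome`, `tryRead`, `tryWrite`, and `pendingOutput` are hypothetical names, and the two callbacks stand in for the SSL read and write attempts and their `SSL_ERROR_WANT_READ/WRITE` results.

```cpp
#include <cassert>
#include <functional>

// All names below are hypothetical; this is neither the AMQP-CPP nor the OpenSSL API.
enum Event : int { readable = 1, writable = 2 };

// Result of one attempted operation, mirroring SSL_ERROR_WANT_READ/WRITE.
enum class Outcome { done, want_read, want_write };

struct WantTracker
{
    // start by reacting to any fd event, as the constructor footnote describes
    int readWants  = readable | writable;
    int writeWants = readable | writable;
    bool pendingOutput = false;          // is there something in the out buffer?

    std::function<Outcome()> tryRead;    // stand-in for the SSL read attempt
    std::function<Outcome()> tryWrite;   // stand-in for the SSL write attempt

    // translate what SSL asked for into fd flags; reset to the default otherwise
    static int toFlags(Outcome o, int fallback)
    {
        switch (o) {
        case Outcome::want_read:  return readable;
        case Outcome::want_write: return writable;
        default:                  return fallback;
        }
    }

    // flags to hand to the event loop for fd monitoring
    int monitored() const
    {
        return readWants | (pendingOutput ? writeWants : 0);
    }

    // called by the event loop; the stored want flags act as filters
    void process(int fdEvents)
    {
        if (pendingOutput && (fdEvents & writeWants)) {
            Outcome o = tryWrite();
            writeWants = toFlags(o, writable);
            if (o == Outcome::done) pendingOutput = false;
        }
        if (fdEvents & readWants) {
            readWants = toFlags(tryRead(), readable);
        }
    }
};
```

The point of keeping the flags per operation is that a write waiting for the socket to become readable no longer blocks reads, and vice versa; this is one interpretation of the description, not a statement of how the PR implements it.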