Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable autoPing in WS tests #3944

Closed
wants to merge 1 commit into from
Closed

Conversation

kciesielski
Copy link
Member

Sometimes WebSocket tests fail, because the default auto ping is sent to the client, which then tries to send back a pong, while the backend has already closed the connection.
This PR ensures that autoPing is enabled only in the test which actually tests pings.

@kciesielski kciesielski marked this pull request as ready for review July 19, 2024 08:32
@kciesielski kciesielski requested a review from adamw July 19, 2024 08:32
@adamw
Copy link
Member

adamw commented Jul 19, 2024

Does this mean that ping frames are sent after the connection is properly closed by exchanging close frames? I'm wondering if this is some problem in our implementation, or is it just scoped to tests?

@kciesielski
Copy link
Member Author

I assumed they are sent right before the Close response, and that the issue is on the client side, because it should ignore it after sending its Close request. But you are right - it may be the backend which has a bug and sends ping when it shouldn't, so I'll verify this. It may be both as well ;)

@kciesielski
Copy link
Member Author

I investigated the subject, and:

  1. Backends won't send ping after sending close
  2. They can still send ping right before sending close, as ping flows are paralle. Tests often wait for close response using:
ws.eitherClose(ws.receiveText())

and receiveText() is in fact receiveText(pongOnPing = true), so the client tries to send Pong and it fails internally:

[info]   Cause: java.io.IOException: Output closed
[info]   at java.net.http/jdk.internal.net.http.websocket.MessageEncoder.encodePong(MessageEncoder.java:301)
[info]   at java.net.http/jdk.internal.net.http.websocket.TransportImpl$SendTask$1.onPong(TransportImpl.java:404)
[info]   at java.net.http/jdk.internal.net.http.websocket.TransportImpl$SendTask$1.onPong(TransportImpl.java:367)

I think this should be improved in sttp -> sending Ping/Pong should be silently skipped just like sending another Close when webSocket.isClosed(). From the protocol perspective it's not really a violation, rather a gray area that's not well specified.

@adamw
Copy link
Member

adamw commented Jul 21, 2024

So I think that's a definitely good catch, but it would also be good to properly fix this, as by disabling auto-pings we won't catch such problems in the future.

But I'm wondering what's the proper fix here ;) I think that first of all, after a channel is closed, we shouldn't send any more pings. Does this happen only in "our" Netty servers, or others as well? I think in Netty the problem might be fixed by removing the WebSocketAutoPingHandler from the pipeline, after the channel is determined to be closed in wrapSubscriberWithNettyCallback, .onComplete or .onError (btw, if there's an IOException, what's the point of sending a close frame - if we know that the channel is closed?)

As for sttp, take a look at: softwaremill/sttp#2236 - is that what you meant?

@kciesielski
Copy link
Member Author

after a channel is closed, we shouldn't send any more pings
As far as I checked, we don't do that. As soon as a backend processes a Close and sends a Close response, no more pings will follow this response.
That's why I believe it's to be fixed only in sttp, I'll follow-up in the linked issue.

@adamw
Copy link
Member

adamw commented Jul 22, 2024

@kciesielski ah, we don't send any more pings because the "reactive stream" is complete, and it's being closed in Netty? Or is there another reason? I'm not quite able to pinpoint this in code.

@kciesielski
Copy link
Member Author

I did some extra digging and there may be indeed a problem with current Netty impl to guarantee no pings after Close. All other backends seem ok, because they send pings in the same stream. Even if merged from two parallel streams, that stream is ended with something like .append(WebSocketFrame.Close) so this should guarantee that Close is indeed the last response.
For Netty, however, we start scheduled Pings totally separately like this:

ctx.channel().eventLoop().scheduleAtFixedRate(sendPing, pingInterval.toMillis, pingInterval.toMillis, TimeUnit.MILLISECONDS)

pingTask = ctx.channel().eventLoop().scheduleAtFixedRate(sendPing, 
  pingInterval.toMillis, pingInterval.toMillis, TimeUnit.MILLISECONDS)

and stop this in a handler:

  override def channelInactive(ctx: ChannelHandlerContext): Unit = {
    super.channelInactive(ctx)
    logger.debug(s"STOPPING WebSocket Ping scheduler for channel ${ctx.channel}")
    if (pingTask != null) {
      val _ = pingTask.cancel(false)
    }
  }

I think it's technically possible that after a reactive streams sends Close and before this channelInactive is triggered, the scheduled task may call sendPing. However, tests never failed on CI on netty backends for that particular reason, so I would register a low-priority issue to deal with this specific case.

@kciesielski
Copy link
Member Author

Closing, because https://github.com/softwaremill/tapir/pull/3968/files pulls softwaremill/sttp-shared#425, which should resolve the issue on the client level.

@kciesielski kciesielski deleted the fix-ws-http4s-test branch August 16, 2024 09:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants