Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore NotConnected error in poll_shutdown() #42

Merged
merged 1 commit into from
Mar 5, 2024
Merged

Conversation

djc
Copy link
Member

@djc djc commented Feb 10, 2024

Fixes #41.

@djc djc requested review from cpu, ctz and quininer February 10, 2024 09:38
Copy link
Member

@cpu cpu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems reasonable to me based on what you found in #41

WDYT about adding some brief context to the commit message?

@djc
Copy link
Member Author

djc commented Feb 10, 2024

Added some context to the commit message.

@TCROC
Copy link

TCROC commented Feb 12, 2024

I ran into this issue as well with reqwest. This commit definitely reduces the amount of errors that I get, but it doesn't fix it completely. When testing hitting https://google.com 100 times, I actually start getting different responses than I get with openssl AND I occasionally get this error now:

connection error: unexpected end of file

I suspect that maybe EOF errors need to be ignored as well?

@TCROC
Copy link

TCROC commented Feb 12, 2024

These are the 3 responses I get:

openssl only returns this one:

Got 220 bytes. Decoded: <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
<A HREF="https://www.google.com/">here</A>.
</BODY></HTML>

rust-tls sometimes returns this one as well as the one above

Got 355 bytes. Decoded: <HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="https://www.google.com/sorry/index?continue=https://google.com/&amp;q=EgRhcBsYGOKZqa4GIin3B1VnSaLgXbzODBl2s8IZC52WjM_OqScOp542rQctJkRTYJ3_e73g8jIGPmpjbmRyWgFD">here</A>.
</BODY></HTML>

And then still occasionally returns:

error sending request for url (https://google.com/): connection error: unexpected end of file

@cpu
Copy link
Member

cpu commented Feb 12, 2024

I suspect that maybe EOF errors need to be ignored as well?
..
And then still occasionally returns:
error sending request for url (https://google.com/): connection error: unexpected end of file

I think this is being addressed upstream (hyperium/hyper#3427, hyperium/h2#743) and is separate from the concerns of this branch. See also the Rustls manual's section on Unexpected EOF.

The other two responses seem fine to me. The choice of the server's 301 or 302 response doesn't seem obviously tied to Rustls.

@TCROC
Copy link

TCROC commented Feb 12, 2024

I suspect that maybe EOF errors need to be ignored as well?
..
And then still occasionally returns:
error sending request for url (https://google.com/): connection error: unexpected end of file

I think this is being addressed upstream (hyperium/hyper#3427, hyperium/h2#743) and is separate from the concerns of this branch. See also the Rustls manual's section on Unexpected EOF.

The other two responses seem fine to me. The choice of the server's 301 or 302 response doesn't seem obviously tied to Rustls.

I added h2 as a dependency and pointed it to the git repo:

h2 = { git = "https://github.com/hyperium/h2.git" }

But I still get the error:

Got an error: error sending request for url (https://google.com/): connection error: unexpected end of file

@TCROC
Copy link

TCROC commented Feb 12, 2024

These are my reqwest dependencies:

reqwest = { version = "0.11", default-features = false, features = [
    "rustls-tls",
] }
tokio = { version = "1", default-features = false }
tokio-rustls = { git = "https://github.com/rustls/tokio-rustls.git", rev = "8c9d6f32b972a3c3882a304f8e5156034805ed43" }
h2 = { git = "https://github.com/hyperium/h2.git" }

@cpu
Copy link
Member

cpu commented Feb 12, 2024

Wouldn't you need to point your reqwest dependency at a fork that also takes a patch to use h2's unreleased git branch with the change? Does cargo tree reflect what you expect w.r.t the patched h2 dependency? I suspect you ended up with two h2 versions and reqwest using h2 0.3.14

Since this seems unrelated to the change in this PR I think it would be more appropriate to discuss in a separate issue.

@TCROC
Copy link

TCROC commented Feb 12, 2024

Wouldn't you need to point your reqwest dependency at a fork that also takes a patch to use h2's unreleased git branch with the change? Does cargo tree reflect what you expect w.r.t the patched h2 dependency? I suspect you ended up with two h2 versions and reqwest using h2 0.3.14

Since this seems unrelated to the change in this PR I think it would be more appropriate to discuss in a separate issue.

Yep I agree. I actually wasn't able to test updating h2 alongside this fix as there were compiler errors when I cloned reqwest. I'm just going to stick to using openssl for now and will circle back around to rustls when reqwest updates with the patches :).

Thanks for the help! :)

@djc
Copy link
Member Author

djc commented Feb 13, 2024

@quininer @ctz do you want to/have time to take a look at this?

Copy link
Member

@quininer quininer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for delay, I just finished my Spring Festival vacation.

A more reasonable approach is to record a state to ensure that write_io is no longer called after call io.shutdown. but that's fine too.

ready!(self.write_io(cx))?;
match ready!(self.write_io(cx)) {
Ok(_) => {}
Err(err) if err.kind() == io::ErrorKind::NotConnected => {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe it should break? otherwise it could loop infinitely.

@djc
Copy link
Member Author

djc commented Feb 20, 2024

Sorry for delay, I just finished my Spring Festival vacation.

No worries, hope you had a nice vacation!

A more reasonable approach is to record a state to ensure that write_io is no longer called after call io.shutdown. but that's fine too.

Ahh, you think this is because poll_shutdown() is called twice? (Looked this up, the std documentation notes that calling it a second time on Linux will yield Ok(()) but on macOS it will yield io::ErrorKind::NotConnected.)

#41 states it's happening because the peer closed the connection.

@djc
Copy link
Member Author

djc commented Feb 20, 2024

Pushed changes to:

  • Fuse poll_shutdown() so that it will yield Ready(Ok(())) on future calls after a successfull poll_shutdown()
  • break out of the write loop and fuse on NotConnected

@quininer
Copy link
Member

I'm think about something like this

fn poll_shutdown(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<io::Result<()>> {
    while !self.start_shutdown && self.session.wants_write() {
        ready!(self.write_io(cx))?;
    }

    self.start_shutdown = true;
    Pin::new(&mut self.io).poll_shutdown(cx)
}

I believe this is because the signal has been emitted when io.shutdown was called, but still returned pending, and when poll_shutdown is called for second time, write_io will return an error.

@cpu
Copy link
Member

cpu commented Feb 20, 2024

Looked this up, the std documentation notes that calling it a second time on Linux will yield Ok(()) but on macOS it will yield io::ErrorKind::NotConnected.
...
I believe this is because the signal has been emitted when io.shutdown was called, but still returned pending, and when poll_shutdown is called for second time, write_io will return an error.

Interesting! For what it's worth yesterday I spent a little bit of time trying to write a test for this condition. I couldn't manage to get shutdown to error despite trying a few different approaches. In hindsight I was testing on Linux and so was likely on a pointless quest.

@Revanee Out of curiosity, in #41 when you were observing this issue were you testing on MacOS?

@Revanee
Copy link

Revanee commented Feb 20, 2024

@cpu I've tested it on Arch Linux with kernel 6.7.0-arch3-1. The example in the issue should be enough to reproduce the error. Here is my Cargo.toml with the dependency versions I used:

[package]
name = "proxy_test"
version = "0.1.0"
edition = "2021"

[dependencies]
hyper = { version = "1.1.0", features = ["full"] }
hyper-util = { version = "0.1.1", features = ["full"] }
tokio = { version = "1.18.1", features = ["macros", "rt-multi-thread", "sync"] }
tokio-native-tls = "0.3.1"
tokio-rustls = "0.25.0"
rustls-pemfile = "2.0.0"

@cpu
Copy link
Member

cpu commented Feb 20, 2024

I've tested it on Arch Linux with kernel 6.7.0-arch3-1

🤔 Very interesting. Thanks!

The example in the issue should be enough to reproduce the error.

Thanks I'll take a look. I didn't start with this repo since my goal was to figure out an in-tree test (e.g. not using a curl client or extra deps).

@djc
Copy link
Member Author

djc commented Feb 28, 2024

@Revanee would you be able to test whether your issue is fixed with the proposal from @quininer?

@Revanee
Copy link

Revanee commented Feb 28, 2024

@djc I tested both the changes from the PR, as well as the proposed changes from @quininer. Neither seem to fix the issue for me. In both cases Pin::new(&mut self.io).poll_shutdown(cx) returns the error Err(Os { code: 107, kind: NotConnected, message: "Transport endpoint is not connected" }).

@djc
Copy link
Member Author

djc commented Feb 29, 2024

Okay, I've pushed a simpler fix, can you test this?

@djc
Copy link
Member Author

djc commented Mar 5, 2024

@Revanee do you have time to test this soon? We'd like to get a new release out soonish with the rustls 0.23 upgrade, and it would be nice to ship a fix for this at the same time.

@Revanee
Copy link

Revanee commented Mar 5, 2024

@djc I tested the new fix and it works! :)

@djc
Copy link
Member Author

djc commented Mar 5, 2024

Thanks! @quininer what do you think of this? It seems like the minimal fix that addresses the observed issue, we can always extend this to more cases if they are encountered in the wild by someone.

@djc djc requested a review from quininer March 5, 2024 11:10
Copy link
Member

@quininer quininer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@quininer
Copy link
Member

quininer commented Mar 5, 2024

I think this is good, at least it doesn't bring risks.

@djc djc merged commit fd3724e into main Mar 5, 2024
6 checks passed
@djc djc deleted the shutdown-not-connected branch March 5, 2024 12:02
jbr added a commit to jbr/futures-rustls that referenced this pull request Mar 19, 2024
@cpu cpu mentioned this pull request Mar 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error NotConnected when serving a TlsStream<TcpStream> when using hyper
5 participants