error codes: provide error codes on stream reset and connection close #623

Draft
wants to merge 3 commits into
base: master
97 changes: 97 additions & 0 deletions error-codes/README.md
@@ -0,0 +1,97 @@
# Error Codes

## Introduction
In the event that a node detects a violation of a protocol or is unable to
complete the steps required by the protocol, it's useful to provide a
reason for disconnection to the other end. This error code can be sent on both

Contributor

It's broader than this. This doesn't only apply to protocol violations (which should be rare), but also to common events like running into resource limits, connections being pruned by the connection manager, etc.

Member Author

I've rephrased this and removed the specific reference to protocol errors.

Connection Close and Stream Reset. Its purpose is similar to an HTTP response
status code. A server can reset the stream with a `BAD_REQUEST` error code on
receiving an invalid request, or with a `RATE_LIMITED` error code when it's busy
handling too many requests. An error code doesn't always indicate
an error condition. For example, a connection may be closed prematurely because
a connection to the same peer on a better transport is available.

## Semantics
Error codes are unsigned 32-bit integers. The range 0 to 10000 is reserved for
libp2p errors. Protocols can define application-specific errors using integers
outside this range. Implementations supporting error codes MUST pass the error
code received from the other end to the application.
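
As a rough illustration of how an implementation might surface the remote's error code to the application, here is a minimal Go sketch. The names `ErrorCode`, `StreamResetError`, and `codeFromError` are hypothetical and are not taken from any libp2p implementation; only the 32-bit width and the reserved range 0 to 10000 come from the text above.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrorCode is an unsigned 32-bit error code (hypothetical type name).
type ErrorCode uint32

// MaxLibp2pErrorCode is the top of the range reserved for libp2p errors.
const MaxLibp2pErrorCode ErrorCode = 10000

// IsLibp2pReserved reports whether c lies in the range reserved for libp2p.
func (c ErrorCode) IsLibp2pReserved() bool { return c <= MaxLibp2pErrorCode }

// StreamResetError is a hypothetical error type that carries the error code
// up to the application layer.
type StreamResetError struct {
	Code   ErrorCode
	Remote bool // true if the reset was initiated by the other end
}

func (e *StreamResetError) Error() string {
	side := "local"
	if e.Remote {
		side = "remote"
	}
	return fmt.Sprintf("stream reset (%s): code %d", side, e.Code)
}

// codeFromError extracts the error code from err, if err wraps a
// StreamResetError anywhere in its chain.
func codeFromError(err error) (ErrorCode, bool) {
	var re *StreamResetError
	if errors.As(err, &re) {
		return re.Code, true
	}
	return 0, false
}

func main() {
	// Simulate a read that failed because the remote reset the stream
	// with an application-specific code (outside 0 to 10000).
	err := fmt.Errorf("read failed: %w", &StreamResetError{Code: 20001, Remote: true})
	if code, ok := codeFromError(err); ok {
		fmt.Println(code, code.IsLibp2pReserved()) // prints "20001 false"
	}
}
```

Wrapping the code in an error value lets existing `Read`/`Write` call sites keep their signatures while still giving applications access to the code.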

Error codes provide a best-effort guarantee that the error will be propagated to
the application layer. This provides backwards compatibility with older nodes,
allows smaller implementations and using transports that don't provide a

Contributor

Do we mean "less popular" here maybe?

Suggested change
- allows smaller implementations and using transports that don't provide a
+ allows less popular implementations that are using transports that don't provide a

Member

I'm not sure about popularity here - maybe "simpler" instead of "smaller"?

Either way this needs an Oxford comma:

Suggested change
- allows smaller implementations and using transports that don't provide a
+ allows smaller implementations, and using transports that don't provide a

Member Author

I actually do prefer "smaller", in the sense of smaller code size. I like "simpler" too. Happy to change if you have strong opinions.

mechanism to provide an error code. For example, Yamux has no equivalent of
QUIC's [STOP_SENDING](https://www.rfc-editor.org/rfc/rfc9000.html#section-3.5-4)
frame that would tell the peer that the node has stopped reading, so there's no
way of signaling an error while closing the read end of the stream on a Yamux
connection.

### Connection Close and Stream Reset Error Codes
Error codes are defined separately for Connection Close and Stream Reset. Stream
Reset errors use the range 0 to 5000, and Connection Close errors use the range
5001 to 10000. Having separate errors for Connection Close and Stream Reset
requires some overlap between the error code definitions but provides more
information to the receiver. Receiving a `Bad Request: Connection Closed` error
on reading from a stream also tells the receiver that no more streams can be
started on the same connection. Implementations MUST provide the Connection
Close error code on streams that are reset as a result of remote closing the

Contributor

Suggested change
- Close error code on streams that are reset as a result of remote closing the
+ Close error code on streams that are reset as a result of the remote closing the

Member Author

Do we need this?

connection.
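
The range split described in this section can be sketched as a small classifier. This is only an illustration under the ranges stated above (0 to 5000 for Stream Reset, 5001 to 10000 for Connection Close, everything else application-defined); the names `CodeKind` and `Classify` are hypothetical.

```go
package main

import "fmt"

// CodeKind says which range a received error code belongs to.
type CodeKind int

const (
	StreamResetCode     CodeKind = iota // libp2p Stream Reset range: 0 to 5000
	ConnectionCloseCode                 // libp2p Connection Close range: 5001 to 10000
	ApplicationCode                     // outside the libp2p reserved range
)

func (k CodeKind) String() string {
	switch k {
	case StreamResetCode:
		return "libp2p stream reset"
	case ConnectionCloseCode:
		return "libp2p connection close"
	default:
		return "application-defined"
	}
}

// Classify maps an error code to its range.
func Classify(code uint32) CodeKind {
	switch {
	case code <= 5000:
		return StreamResetCode
	case code <= 10000:
		return ConnectionCloseCode
	default:
		return ApplicationCode
	}
}

func main() {
	for _, c := range []uint32{0, 5000, 5001, 10000, 10001} {
		fmt.Printf("%d: %s\n", c, Classify(c))
	}
}
```

A receiver that sees a code in the Connection Close range while reading from a stream can use this distinction to stop opening new streams on that connection.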

For stream resets, when the underlying transport supports it, implementations
SHOULD allow sending an error code on both closing the read side of the stream
and resetting the write side of the stream.

## Libp2p Error Codes
TODO!

## Wire Encoding
Different transports will encode the 32-bit error code differently.

### QUIC
QUIC provides the ability to send an error on closing the read end of the
stream, resetting the write end of the stream, and closing the connection.

For stream resets, the error code MUST be sent on the `RESET_STREAM` or the
`STOP_SENDING` frame using the `Application Protocol Error Code` field as per
the frame definitions in the
[RFC](https://www.rfc-editor.org/rfc/rfc9000.html#name-reset_stream-frames).

For Connection Close, the error code MUST be sent on the `CONNECTION_CLOSE` frame
using the `Error Code` field as defined in the
[RFC](https://www.rfc-editor.org/rfc/rfc9000.html#section-19.19-6.2.1).
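
In RFC 9000, the error code fields of these frames are variable-length integers (Section 16). A QUIC library normally handles this encoding itself; the sketch below only illustrates how a 32-bit error code would appear on the wire. The helper names `appendErrorCode` and `hexCode` are hypothetical.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"fmt"
)

// appendErrorCode appends code to b using QUIC's variable-length integer
// encoding: the two most significant bits of the first byte select a
// 1, 2, 4, or 8 byte encoding.
func appendErrorCode(b []byte, code uint32) []byte {
	switch {
	case code < 1<<6:
		return append(b, byte(code)) // prefix 00, 1 byte
	case code < 1<<14:
		return binary.BigEndian.AppendUint16(b, uint16(code)|0x4000) // prefix 01, 2 bytes
	case code < 1<<30:
		return binary.BigEndian.AppendUint32(b, code|0x80000000) // prefix 10, 4 bytes
	default:
		// Values of 2^30 and above do not fit in 30 bits and need 8 bytes.
		return binary.BigEndian.AppendUint64(b, uint64(code)|0xc000000000000000) // prefix 11
	}
}

// hexCode returns the wire encoding of code as a hex string.
func hexCode(code uint32) string {
	return hex.EncodeToString(appendErrorCode(nil, code))
}

func main() {
	for _, c := range []uint32{37, 5001, 15293, 494878333, 4294967295} {
		fmt.Printf("%d -> %s\n", c, hexCode(c))
	}
}
```

Note that every code in the libp2p reserved range fits in at most two bytes, while application codes at or above 2^30 take eight.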

### Yamux
Yamux has no `STOP_SENDING` frame, so there's no way to signal an error on

Contributor

I wonder if there's an addition we could add to the spec to support this in a backwards compatible way. Not worth doing now, but maybe could be added in the future.

Contributor

I made some suggestions for the wire encoding in #479. You'd have to check if existing implementations handle this gracefully.

Member Author

@marten-seemann do you mean suggestions for sending an error code? There's this PR for yamux for that #622

Member Author

@MarcoPolo I think we can add this, there are 16 flag bits, so we do have space for a new flag and implementations only look at the flag bits they care about. I created #627

closing the read side of the stream.

For Connection Close, the 32-bit Length field is to be interpreted as the error
code. The error code MUST be sent as a big-endian unsigned 32-bit integer.
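
A minimal sketch of the Connection Close encoding follows. It assumes the close is carried in a Yamux Go Away frame (type `0x3`, stream ID 0) with the standard 12-byte header of version, type, flags, stream ID, and length; the text above names only the Length field, so the frame type is an assumption here. The function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	yamuxVersion    = 0   // protocol version byte
	yamuxTypeGoAway = 0x3 // assumed frame type for connection close
	yamuxHeaderSize = 12  // version(1) + type(1) + flags(2) + stream ID(4) + length(4)
)

// encodeGoAway builds a Go Away header whose Length field carries the
// error code as a big-endian unsigned 32-bit integer.
func encodeGoAway(code uint32) [yamuxHeaderSize]byte {
	var h [yamuxHeaderSize]byte
	h[0] = yamuxVersion
	h[1] = yamuxTypeGoAway
	binary.BigEndian.PutUint16(h[2:4], 0)     // no flags
	binary.BigEndian.PutUint32(h[4:8], 0)     // stream ID 0 addresses the session
	binary.BigEndian.PutUint32(h[8:12], code) // Length field reused as error code
	return h
}

// decodeGoAway returns the error code from a Go Away header, or false if
// the header is not a Go Away frame of the expected version.
func decodeGoAway(h [yamuxHeaderSize]byte) (uint32, bool) {
	if h[0] != yamuxVersion || h[1] != yamuxTypeGoAway {
		return 0, false
	}
	return binary.BigEndian.Uint32(h[8:12]), true
}

func main() {
	h := encodeGoAway(5001)
	fmt.Printf("% x\n", h[:])
	code, ok := decodeGoAway(h)
	fmt.Println(code, ok)
}
```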

For Stream Resets, the error code is sent as the first 4 bytes of the Data Frame

Contributor

Linking this here since this will change: #622 (comment)

following the header with RST flag set as defined in the [yamux spec
extension](https://github.com/libp2p/specs/pull/622).

Note: On TCP connections with `SO_LINGER` set to 0, the error code on connection
close may not be delivered.

### WebRTC
Since WebRTC doesn't provide reliable delivery for frames that are followed by
closing of the underlying data channel, there's no simple way to introduce error
codes in the current implementation. Consider the most common use case of
resetting both read and write sides of the stream and closing the data channel.
Chrome's implementation will not deliver the final `RESET_STREAM` message. The Go
implementation will deliver the `RESET_STREAM` frame and then close the data
channel, but there's no guarantee that Chrome's implementation will provide the
`RESET_STREAM` message to the application layer after it receives the data
channel close message.

### WebTransport
Error codes for WebTransport will be introduced when browsers upgrade to draft-9
of the spec. The current WebTransport spec implemented by Chrome and Firefox is
[draft-2 of WebTransport over
HTTP/3](https://www.ietf.org/archive/id/draft-ietf-webtrans-http3-02.html#section-4.3-2).
This version allows for only a 1-byte error code, which is too restrictive.
Because the latest WebTransport draft,
[draft-9](https://www.ietf.org/archive/id/draft-ietf-webtrans-http3-02.html#section-4.3-2),
allows for a 4-byte error code to be sent on stream resets, we will introduce
error codes over WebTransport later.