Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Header encoding error for custom response headers: Invalid character in header content ["custom-response-header"] #343

Open
2 tasks done
jcbain opened this issue Mar 22, 2024 · 3 comments
Labels

Comments

@jcbain
Copy link
Contributor

jcbain commented Mar 22, 2024

Prerequisites

  • I have written a descriptive issue title
  • I have searched existing issues to ensure the bug has not already been reported

Fastify version

4.26.2

Plugin version

9.4.0

Node.js version

18

Operating system

Linux

Operating system version (i.e. 20.04, 11.3, 10)

debian 10

Description

Our proxying mechanism is experiencing an uncaught exception for custom response header values that contain characters such as é with the following error message Invalid character in header content ["custom-response-header"]. When proxying traffic, we can't control all of the possible header values upstream services might decorate a reply with and I was curious on your all's thoughts on handling situations with these special characters.

Steps to Reproduce

Create an upstream that sets an invalid response header:

const httpProxy = require('@fastify/http-proxy');
const fastify = require('fastify');

const app = fastify();

app.get('/', (request, reply) => {
  reply.header('custom-response-header', 'éeek');
  return { message: 'hello world' }
});

app.listen({ port:4001 });

const proxy = fastify();

proxy.register(httpProxy, { 
   upstream: 'http://localhost:4001'
});

proxy.listen({ port: 6001 });

Expected Behavior

When looking at the RFC8187 section 3.2.1 (https://www.rfc-editor.org/rfc/rfc8187#section-3.2.1), it specifies the following, which seems as if it should be handled in some way instead of throwing.

      Note: Recipients should be prepared to handle encoding errors,
      such as malformed or incomplete percent escape sequences, or
      non-decodable octet sequences, in a robust manner.  This
      specification does not mandate any specific behavior; for
      instance, the following strategies are all acceptable:

      *  ignoring the parameter,

      *  stripping a non-decodable octet sequence, and

      *  substituting a non-decodable octet sequence by a replacement
         character, such as the Unicode character U+FFFD (Replacement
         Character).
@mcollina
Copy link
Member

I agree! This should be handled somehow. Would you like to send a PR?

@mcollina mcollina added the bug label Mar 22, 2024
@q0rban
Copy link

q0rban commented Mar 22, 2024

After reading RFC 8187 a little closer, it seems that it only applies to headers that use parameters:

This document specifies an encoding suitable for use in HTTP header fields that is compatible with a simplified profile of the encoding defined in RFC 2231. It can be applied to any HTTP header field that uses the common "parameter" ("name=value") syntax.

So I think that means you can only specify an encoding on an http header that uses parameter syntax. For example, the content-disposition header:

https://nodejs.org/docs/latest/api/http.html#requestsetheadername-value

const filename = 'Rock 🎵.txt';
request.setHeader('Content-Disposition', `attachment; filename*=utf-8''${encodeURIComponent(filename)}`);

So, let's say a content-disposition header is received by the proxy, the filename contains non-ascii characters, and is not encoded using the syntax described in RFC 8187. In that scenario, should the proxy return a 5xx error, should it strip the header, or should it attempt to properly encode the header following RFC 8187 (this seems complicated and would likely incur a performance penalty)?

Alternately, let's say a header is not in parameter syntax, and has non-ASCII characters in it. In that case, we cannot use the encoding approach outlined in RFC 8187. Should we url encode the string, regex-replace non-ASCII characters with ?, or some other approach?

Perhaps a good first step is to come up with a variety of example / test cases.

@jcbain
Copy link
Contributor Author

jcbain commented Mar 22, 2024

Good catch. So I think then the relevant section is 3.2.4 of RFC 7230 https://www.rfc-editor.org/rfc/rfc7230#section-3.2.4.

Historically, HTTP has allowed field content with text in the
ISO-8859-1 charset [ISO-8859-1], supporting other charsets only
through use of [RFC2047] encoding. In practice, most HTTP header
field values use only a subset of the US-ASCII charset [USASCII].
Newly defined header fields SHOULD limit their field values to
US-ASCII octets. A recipient SHOULD treat other octets in field
content (obs-text) as opaque data.

Would this indicate that a proxy should be using the replacement character for those that fall outside of this standard per this line "A recipient SHOULD treat other octets in field content (obs-text) as opaque data"?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants