Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support "CONNECT" requests (in proxies) #481

Open
jcarlsonautomatiq opened this issue Nov 8, 2023 · 11 comments
Open

Support "CONNECT" requests (in proxies) #481

jcarlsonautomatiq opened this issue Nov 8, 2023 · 11 comments
Labels
enhancement New feature or request

Comments

@jcarlsonautomatiq
Copy link

MSW is a great library and I have used it successfully in other "got" calls but in one part of my unit testing I was actually checking that a proxy call would actually try and use the proxy. Unfortunately I think the way that the path and proxy information is generated results in an invalid URL which bails before you can get a matcher to handle it. The proxy host is correctly assigned but then the path(the actual url we call) is just directly concatenated as a path.


   // From getUrlByRequestOptions.ts:169
   =============
   const url = new URL(`${protocol}//${hostname}${path}`)
   url.username = credentials?.username || ''
   url.password = credentials?.password || ''

   // ERROR
   =============
          input: 'http://fake.proxy:443fake.ping:443',
          code: 'ERR_INVALID_URL',

            at new URL (node:internal/url:783:36)
            at getUrlByRequestOptions (/node_modules/.pnpm/@[email protected]/node_modules/@mswjs/interceptors/lib/node/chunk-YVNH3GJ5.js:640:15)
            at normalizeClientRequestArgs (/node_modules/.pnpm/@[email protected]/node_modules/@mswjs/interceptors/lib/node/chunk-YVNH3GJ5.js:766:11)

 
// From the DEBUG:
=============
3:51:32:307 [http normalizeClientRequestArgs] successfully cloned RequestOptions! {"rejectUnauthorized":false,"agent":{"_events":{},"_eventsCount":2,"defaultPort":443,"protocol":"https:","options":{"keepAlive":true,"keepAliveMsecs":1000,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","noDelay":true,"path":null},"requests":{},"sockets":{},"freeSockets":{},"keepAliveMsecs":1000,"keepAlive":true,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","maxTotalSockets":null,"totalSocketCount":0,"maxCachedSessions":100,"_sessionCache":{"map":{},"list":[]},"proxy":"http://proxyUsername:[email protected]:443/"},"setHost":true,"method":"GET","headers":{"user-agent":"got (https://github.com/sindresorhus/got)","accept-encoding":"br, gzip, deflate"}}
13:51:32:307 [http normalizeClientRequestArgs] derived request options: {"method":"GET","protocol":"https:","hostname":"fake.ping","host":"fake.ping","path":"/pinging","rejectUnauthorized":false,"agent":{"_events":{},"_eventsCount":2,"defaultPort":443,"protocol":"https:","options":{"keepAlive":true,"keepAliveMsecs":1000,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","noDelay":true,"path":null},"requests":{},"sockets":{},"freeSockets":{},"keepAliveMsecs":1000,"keepAlive":true,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","maxTotalSockets":null,"totalSocketCount":0,"maxCachedSessions":100,"_sessionCache":{"map":{},"list":[]},"proxy":"http://proxyUsername:[email protected]:443/"},"setHost":true,"headers":{"user-agent":"got (https://github.com/sindresorhus/got)","accept-encoding":"br, gzip, deflate"}}
13:51:32:307 [http normalizeClientRequestArgs] has no default agent, setting the default agent for "https:"
13:51:32:307 [http normalizeClientRequestArgs] successfully resolved url: https://fake.ping/pinging
13:51:32:307 [http normalizeClientRequestArgs] successfully resolved options: {"method":"GET","protocol":"https:","hostname":"fake.ping","host":"fake.ping","path":"/pinging","rejectUnauthorized":false,"agent":{"_events":{},"_eventsCount":2,"defaultPort":443,"protocol":"https:","options":{"keepAlive":true,"keepAliveMsecs":1000,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","noDelay":true,"path":null},"requests":{},"sockets":{},"freeSockets":{},"keepAliveMsecs":1000,"keepAlive":true,"maxSockets":256,"maxFreeSockets":256,"scheduling":"lifo","maxTotalSockets":null,"totalSocketCount":0,"maxCachedSessions":100,"_sessionCache":{"map":{},"list":[]},"proxy":"http://proxyUsername:[email protected]:443/"},"setHost":true,"headers":{"user-agent":"got (https://github.com/sindresorhus/got)","accept-encoding":"br, gzip, deflate"},"_defaultAgent":{"_events":{},"_eventsCount":2,"defaultPort":443,"protocol":"https:","options":{"keepAlive":true,"scheduling":"lifo","timeout":5000,"noDelay":true,"path":null},"requests":{},"sockets":{},"freeSockets":{},"keepAliveMsecs":1000,"keepAlive":true,"maxSockets":null,"maxFreeSockets":256,"scheduling":"lifo","maxTotalSockets":null,"totalSocketCount":0,"maxCachedSessions":100,"_sessionCache":{"map":{},"list":[]}}}
13:51:32:308 [http normalizeClientRequestArgs] successfully resolved callback: undefined
13:51:32:310 [http request] request call (protocol "http"): [{"method":"CONNECT","host":"fake.proxy","port":"443","path":"fake.ping:443","setHost":false,"headers":{"connection":"keep-alive","host":"fake.ping:443","proxy-authorization":"Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk"},"agent":false}]
13:51:32:310 [http normalizeClientRequestArgs] arguments [{"method":"CONNECT","host":"fake.proxy","port":"443","path":"fake.ping:443","setHost":false,"headers":{"connection":"keep-alive","host":"fake.ping:443","proxy-authorization":"Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk"},"agent":false}]
13:51:32:310 [http normalizeClientRequestArgs] using default protocol: http:
13:51:32:310 [http normalizeClientRequestArgs] first argument is RequestOptions: {"method":"CONNECT","host":"fake.proxy","port":"443","path":"fake.ping:443","setHost":false,"headers":{"connection":"keep-alive","host":"fake.ping:443","proxy-authorization":"Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk"},"agent":false}
13:51:32:310 [http normalizeClientRequestArgs] normalized request options: {"method":"CONNECT","host":"fake.proxy","port":"443","path":"fake.ping:443","setHost":false,"headers":{"connection":"keep-alive","host":"fake.ping:443","proxy-authorization":"Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk"},"agent":false,"protocol":"http:"}
13:51:32:310 [utils getUrlByRequestOptions] request options {"method":"CONNECT","host":"fake.proxy","port":"443","path":"fake.ping:443","setHost":false,"headers":{"connection":"keep-alive","host":"fake.ping:443","proxy-authorization":"Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk"},"agent":false,"protocol":"http:"}
13:51:32:310 [utils getUrlByRequestOptions] figuring out url from request options...
13:51:32:310 [utils getUrlByRequestOptions] protocol http:
13:51:32:310 [utils getUrlByRequestOptions] host fake.proxy
13:51:32:310 [utils getUrlByRequestOptions] port 443
13:51:32:310 [utils getUrlByRequestOptions] hostname fake.proxy:443
13:51:32:310 [utils getUrlByRequestOptions] path fake.ping:443
13:51:32:310 [utils getUrlByRequestOptions] credentials undefined
13:51:32:310 [utils getUrlByRequestOptions] auth string:

Code that pretty much reproduces it.

// Pseudo code reproduction
const { HttpsProxyAgent } = require('hpagent');
const host = 'fake.proxy';
const username = "proxyUsername";
const password = "proxyPassword";
const port = '443';

   let proxy =  HttpsProxyAgent({
      keepAlive: true,
      keepAliveMsecs: 1000,
      maxSockets: 256,
      maxFreeSockets: 256,
      scheduling: 'lifo',
      proxy: `http://`${username}:${password}@${host}:${port}`,
    });


  // Then used in a got call like this:
 export const PING_URL  = "https://fake.ping/pinging";
 got.get({url: PING_URL, agent: { https: proxy})

@kettanaito
Copy link
Member

Hey, @jcarlsonautomatiq. Thanks for reporting this.

Not sure what's going off here. The path option in ClientRequest must always be a string starting with /. It appears that when using hpagent, it's not the case. This may as well be one of the many real-world use cases that are tricky and have to be handled in getUrlByRequestOptions.

We need to add a test for this.

@kettanaito
Copy link
Member

@jcarlsonautomatiq can you please share how you intend to mock those requests?

@kettanaito
Copy link
Member

Your original example also has a mistake: got cannot accept that proxy agent as the value of agents.https. Doing so produces both TypeScript and runtime errors.

TypeScript:

No overload matches this call.
  The last overload gave the following error.
    Property 'options' is missing in type 'HttpProxyAgent' but required in type 'Agent'.ts(2769)
https.d.ts(31, 9): 'options' is declared here.

Runtime:

RequestError: Protocol "https:" not supported. Expected "http:"
 ❯ Request._makeRequest ../node_modules/.pnpm/[email protected]/node_modules/got/dist/source/core/index.js:1191:19
 ❯ new ClientRequest ../node:_http_client:189:11

I can help with fixing this but I need an exact example that is complete and working (well, failing, in our case).

@jcarlsonautomatiq
Copy link
Author

This code is currently being migrated from raw working but untested JS to Typescript which was one of the reasons I was actually adding the unit tests. I wonder if there is an update in got as well we are on "got-cjs": "^12.0.1" and it doesn't cause a protocol error. I will see about stripping this down to remove all the company stuff.

As for testing it I was hoping I would be able to intercept the proxy url and then look at the original request to determine what to return (or just grab a ?url out of the call). A little hacky but it would let me use the proxy calls and ensure I didn't break anything in the migration. I will hack up a proper working snippet this morning.

@kettanaito
Copy link
Member

kettanaito commented Nov 9, 2023

The issue with the path originates from here:
https://github.com/delvedor/hpagent/blob/96f45f1d40bfbdfd0fcc84cdba056be6e0fb8f4c/index.js#L23

I'm not sure what's the intention here by providing the following request options to create a request:

[
  {
    method: 'CONNECT',
    host: 'fake.proxy',
    port: '443',
    path: 'fake.ping:80',
    setHost: false,
    headers: {
      connection: 'keep-alive',
      host: 'fake.ping:80',
      'proxy-authorization': 'Basic cHJveHlVc2VybmFtZTpwcm94eVBhc3N3b3Jk'
    },
    agent: false,
    timeout: 0
  }
]

What's the intended end request URL here, I wonder?

I can assume it's https://fake.proxy/fake.ping:80. Once I do that, we get the next error:

RequestError: 'CONNECT' HTTP method is unsupported.

And this one is far more interesting. See, the Fetch API specification forbids creating requests with CONNECT method:

new Request('/resource', { method: 'CONNECT' })

Security report

This fails standalone.

@jcarlsonautomatiq
Copy link
Author

jcarlsonautomatiq commented Nov 9, 2023

The code is using some long lived proxies from Nimble. I expect that might be some of the strange connection issue but that is definitely odd. And apologies I should have had the fully working example in place initially.

    import { HttpsProxyAgent } from 'hpagent';  // version: <edit>. Updated to 1.2.0 and it still will repro.
    import { got } from 'got-cjs';  // version: 12.5.4

    const host = 'fake.proxy';
    const username = "proxyUsername";
    const password = "proxyPassword";
    const port = '443';
    
    let proxyUrl = `http://${username}:${password}@${host}:${port}`;
    let proxy = new HttpsProxyAgent({
       keepAlive: true,
       keepAliveMsecs: 1000,
       maxSockets: 256,
       maxFreeSockets: 256,
       scheduling: 'lifo',
       proxy: proxyUrl,
     });
    
      // Then used in a got call like this:
     const PING_URL  = "https://fake.ping/pinging";
     await got.get({
        url: PING_URL,
        throwHttpErrors: false,   // No idea the code thought it should do this?
        agent: { https: proxy},
        https: { rejectUnauthorized: false },
      });

@jcarlsonautomatiq
Copy link
Author

jcarlsonautomatiq commented Nov 9, 2023

It seems like if the invalid URL error is fixed this code test will probably have to be moved to the browser version of the MSW lib and only tested e2e which should be fine. Plus eventually I will run into the same agent and type conflicts :)

I will look more at the hpagent code that it might need a note about not being used in Node directly. Or forcing a custom http agent when used with a keepAlive connection outside a browser. odd ad of Nodejs > 19 supposedly keepAlive works with native fetch but finding definitive info about CONNECT in the spec is surprisingly hard.

https://stackoverflow.com/questions/62500011/reuse-tcp-connection-with-node-fetch-in-node-js

@kettanaito
Copy link
Member

Okay, so learned a lot from the Node.js docs, Got, and hpagent here. Sharing below.

The URL is invalid, or is it?

Turns out, when performing the CONNECT method in ClientRequest, the path option is expected to point to the target host:

const options = {
  port: 1337,
  host: '127.0.0.1',
  method: 'CONNECT',
  path: 'www.google.com:80',
}

const req = http.request(options)

Excerpt from the example in Node.js docs

  • We must handle this in Interceptors

Moreover, when the server receives that request, the req.url already equals to www.google.com:80. This is some Node.js server magic I presume, but Interceptors must do the same thing (req.method remains CONNECT).

Request flow during proxying

This is what happens to the request in a simplified proxy setup (following the same Node.js example):

  1. Request is constructed with CONNECT method and <target> path.
  2. Server receives the connect event where req.method is CONNECT and req.url is <target>.
  3. Server creates a new connection to req.url.
  4. Once the connection is established to the target host, server writes to the request the 200 Connection <target> HTTP message (direct request.write()).
  5. Server then writes the head from the connect event to the request also.
  6. Request then emits the connect event with three arguments:
  7. res, IncomingMessage. This is the `200 Connection `` response written by the server to the request (the header of the connection HTTP message).
  8. socket, net.Socket, the socket of the new connection to the <target>.
  9. head, Buffer, whatever head server has written in its connect event.

In the connect request callback, you can then issue new requests by writing directly onto the target socket and receiving the response from the target via socket.on('data').

I assume that last part is handled by hpagent here because it's rather low-level.

@kettanaito kettanaito changed the title Proxy URL is not generated correctly and causes Invalid URL error from node:internal Support "CONNECT" requests (in proxies) Nov 9, 2023
@kettanaito
Copy link
Member

kettanaito commented Nov 9, 2023

Basically, we don't support this message exchange right now. This is a new feature request and I will prioritize it accordingly. Will push the WIP pull request I have right now with the failing test and some findings. Thanks for making this apparent, @jcarlsonautomatiq!

I'm not sure as of now how to properly support this. Since the connect event on ClientRequest exposes an extraneous Socket, it means we have to support the socket-level messaging in Interceptors (see #399). This isn't that easy and I'd rather avoid it if we can.

@kettanaito kettanaito added the enhancement New feature or request label Nov 9, 2023
@jcarlsonautomatiq
Copy link
Author

jcarlsonautomatiq commented Nov 9, 2023

Thanks for putting all the information in here it makes it a lot easier to think it through! If I might recommend a possible test framework solution I think just about everyone would accept it might go like this (I certainly would):

  • Tell people that proxy code with a keepAlive is currently out of scope in the docs which is completely reasonable.
  • Allow for a setting called 'proxyBehavior': transparent(default)|error|etc which the interceptor can look for and then just ignore the proxy and use the normal matchers for the destination URL call and build the url that way with the eventual actual method call (GET|POST|ETC). The headers would still have the proxy information so you could test your proxy was 'used'.
  • Allow for a setting called 'youCanImplementYourProxyIfYourAreInsane' which gives people the ability to define a 'proxy' function that can run before the interceptors and do any header or rejections as required.

@kettanaito
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants