Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

File upload object is empty in federated gateway but not if called directly on the service #427

Open
ghost opened this issue Jul 13, 2019 · 36 comments

Comments

@ghost
Copy link

ghost commented Jul 13, 2019

After federating a service that implements the upload scalar, for some reason, if I hit that resolver from the federated gateway, the file object is empty.

(base) cL:gateway chris$ npm run start-apollos

> [email protected] start-apollos /Users/chris/projects/bx/gateway
> concurrently "npm:start-apollo-*"

[start-apollo-data]
[start-apollo-data] > [email protected] start-apollo-data /Users/chris/projects/bx/gateway
[start-apollo-data] > NODE_ENV=development node -r ts-node/register ./src/apollos/DataLibrary/index.ts
[start-apollo-data]
[start-apollo-accounts]
[start-apollo-accounts] > [email protected] start-apollo-accounts /Users/chris/projects/bx/gateway
[start-apollo-accounts] > node -r ts-node/register ./src/apollos/Accounts/index.ts
[start-apollo-accounts]
[start-apollo-bioref]
[start-apollo-bioref] > [email protected] start-apollo-bioref /Users/chris/projects/bx/gateway
[start-apollo-bioref] > node -r ts-node/register ./src/apollos/BioRef/index.ts
[start-apollo-bioref]
[start-apollo-accounts] Account apollo running at http://localhost:9001/
[start-apollo-bioref] BioRef Apollo running at http://localhost:9003/
[start-apollo-data] 🚀 Server ready at http://localhost:9002/graphql
[start-apollo-data] { file: {}, something: 'hello' }
[start-apollo-data] { file: {}, something: 'hello' }

But then if I hit the service directly, everything is fine.

[start-apollo-data] 🚀 Server ready at http://localhost:9002/graphql
[start-apollo-data] Promise {
[start-apollo-data]   { filename: 'test.csv',
[start-apollo-data]     mimetype: 'text/csv',
[start-apollo-data]     encoding: '7bit',
[start-apollo-data]     createReadStream: [Function: createReadStream] } }

Is there something I am doing wrong with the apollo-gateway?

For reference,

// TypeDefs
export const typeDefs = gql`
  scalar Upload

  extend type Mutation {
    testUpload(file: Upload, something: String): String
  }
`;
// *Resolvers

import { GraphQLUpload } from 'graphql-upload';

const testUpload = async (obj, { file }, ctx, info) => {
  const { createReadStream } = await file;
  console.log(file);
  return '';
};
export const resolvers = {
  Upload: GraphQLUpload,
  Mutation: {
     testUpload(obj, args, ctx, info) {
        return testUpload(obj, args, ctx, info)
     }
  },
};

For now, I've worked around it by registering another client that directly hits the service instead of the gateway just to do file uploads, but I feel like this defeats the point of federating in the first place.

Any suggestions/guidance would be greatly appreciated!

@ghost ghost changed the title File upload object is empty when resolver is hit from gateway but not if called directly File upload object is empty in federarted gateway but not if called directly on the service Jul 13, 2019
@ghost ghost changed the title File upload object is empty in federarted gateway but not if called directly on the service File upload object is empty in federated gateway but not if called directly on the service Jul 13, 2019
@lbaker2
Copy link

lbaker2 commented Aug 11, 2019

So when the file is uploaded it hits the gateway first. File uploads are written to temporary files and the data can be streamed into your application. You need to either resolve the upload in your gateway and pass to your downstream service as a buffer or find a different mechanism of forwarding the file data to your service. If you inspect the request variables at the gateway level you should see the same promise object mentioned above. Hope this helps

@henry74
Copy link

henry74 commented Aug 12, 2019

@lbaker2 are you offering a solution in the context of the gateway, explaining how it currently works, or something else?

@lbaker2
Copy link

lbaker2 commented Aug 12, 2019

@henry74 a little bit of hinting at a solution and explaining how it currently works.

@henry74
Copy link

henry74 commented Aug 15, 2019

I believe this misses the spirit/purpose of the federated gateway if we have to add our own user code:

That’s it! With Apollo Federation, your schemas and resolvers live in the individual services. There is no user code in the gateway, just a reference to each of the federated services that make up the graph.

@lbaker2
Copy link

lbaker2 commented Aug 18, 2019

@henry74 I completely agree about not adding application logic or user code into the gateway. There is an option for RemoteGraphQLDataSource called willSendRequest where any unresolved promises can be resolved prior to sending the request to a downstream service. One could always extend RemoteGraphQLDataSource to achieve this behavior by default.

@jacob-ebey
Copy link

Has anyone gotten file upload through the gateway working? I'm not familiar enough with RemoteGraphQLDataSource and an example would be great.

@lbaker2
Copy link

lbaker2 commented Aug 23, 2019

Yes. It’s very easy if both your gateway and service that needs the upload are on the same server. I am in the process of writing about it.

@henry74
Copy link

henry74 commented Aug 23, 2019

Share what you're thinking, but we live in the world of docker containers and kubernetes so requiring the upload service to live in the same "server" won't work for our use case.

@jacob-ebey
Copy link

Same... The upload/images services are not even owned by the same teams... Hence the point of federation...

@lbaker2
Copy link

lbaker2 commented Aug 23, 2019

@henry74 @jacob-ebey both great points. I meant to say it is easy to illustrate a solution when both the gateway and service live on the same "server". @henry74 we aren't using containers, so I hadn't thought about that. However, if there is a shared volume I think you can achieve the same behavior... That being said, every solution I have found so far has required some tweaking of the downstream services that consume the files. I'm going to throw some code in here, so please be kind. I'm just trying to help. Starting from a very simple gateway configuration -

const gateway = new ApolloGateway({
  serviceList,
  buildService({ url }) {
    return new RemoteGraphQLDataSource({
      url,
      async beforeSendRequest({ request, context }) {
        // your upload should have
        // Promise { { filename, mimetype, encoding, createReadStream } }
        // if you dont resolve the promise, then and empty object will be sent
        console.log(request.variables)
      }

You have a few options at this point: forward the file as buffer, send the temp file location, or stream the file to shared volume and send temp file location. All the examples exclude error handling and assume the upload to be at request.variables.myUpload

Sending as a buffer

// sending as a buffer, though I would not recommend it
async beforeSendRequest({ request, context }) {
  const myUpload = await request.variables.myUpload
  let stream = await file.createReadStream()
  await new Promise((resolve, reject) => {
    let buffers = []
    stream.on('error', error => reject(error))
    stream.on('data', data => buffers.push(data))
    stream.on('end', () => {
      request.variables.myUpload.file = Buffer.concat(buffers)
      resolve()
    })
  }
}

// in the service your resolver could read the buffer
(_, args) => {
  const { myUpload: { file: { data } } } = args
  // your upload data is now available as a buffer
  const buffer = Buffer.from(data)
 // ...
}

Sharing the file via /tmp

// sharing the temporary file location
async beforeSendRequest({ request, context }) {
  const myUpload = await request.variables.myUpload
  let stream = await file.createReadStream()
  request.variables.myUpload.filepath = stream.path
}

// read the file from your resolver
// for example if you use sharp to process images you could do

(_, args) => {
  const { myUpload: { filepath } } = args
  await sharp(filepath).png().toFile('/your/final/destination')
  // ...
}

Sharing the file via shared volume

// sharing the temporary file location
async beforeSendRequest({ request, context }) {
  const myUpload = await request.variables.myUpload
  const location = '/path/to/shared/volume/myUpload'
  let writeStream = fs.createWriteStream()
  let readStream = await file.createReadStream()
  await new Promise((resolve, reject) => {
    writeStream.on('error', () => {
       reject()
    })
    writeStream.on('finish', () => {
       resolve()
    })
    readStream.pipe(writeStream)
  })
  request.variables.myUpload.filepath = location
}

// read the file from your resolver
// for example if you use sharp to process images you could do

(_, args) => {
  let { myUpload: { filepath } } = args
  await sharp(filepath).png().toFile('/your/final/destination')
  // ...
}

Obviously, it's not ideal to change resolvers in your services... hopefully, this helps someone. One last thing... you can extend RemoteGraphQLDataSource to iterate over your variables and use Promise.all to resolve them.

@lbaker2
Copy link

lbaker2 commented Aug 24, 2019

There is one other strategy I'd like to try. It would require the form-data package. I'd be happy with a solution that did not require any changes with resolvers, but converting the request to a multipart request is a little tricky. We'd have to take a look at the spec https://github.com/jaydenseric/graphql-multipart-request-spec that is used by apollo-upload-client

@jacob-ebey
Copy link

jacob-ebey commented Aug 25, 2019

@lbaker2, thanks for the response. I'll spend some time this weekend looking into the solutions.

@mfellner
Copy link

Our team is also interested in supporting GraphQL file uploads with Apollo federation. We're using Hapi which has it's own quirks, unfortunately (apollographql/apollo-server#1680, hapijs/hapi#3465).

I was finally able to make file uploads and federation work by implementing a custom RemoteGraphQLDataSource and using form-data, as @lbaker2 suggested:

https://github.com/mfellner/apollo-server-upload-example/blob/master/frontend-server/file-upload-data-source.js

One downside of multipart file uploads in Node.js seems to be that files may be fully loaded into memory or temporarily stored on disk (see Eran Hammer's comment here: hapijs/hapi#3465 (comment)). With multiple GraphQL gateways (like with Apollo federation) files need to be handled in this way on multiple servers. 😕

@lbaker2
Copy link

lbaker2 commented Aug 25, 2019

@mfellner Very nice!. I was about to implement the solution this morning, but now I can just enjoy my coffee :)

@ziplizard
Copy link

ziplizard commented Aug 27, 2019

I have an ApolloGateway server as well as a federated service - on separate boxes. On the service I'm able to upload directly following this spec: https://github.com/jaydenseric/graphql-multipart-request-spec. I want to be able to make some changes on the gateway, with minimal changes to the spec, to allow uploads from the gateway. How do I go about doing this?
This is the error I'm getting:
Missing multipart field ‘operations’

@lbaker2
Copy link

lbaker2 commented Aug 27, 2019

@ziplizard does the solution posted by mfellner meet your criteria?

@ziplizard
Copy link

ziplizard commented Aug 28, 2019

@lbaker2 I have implemented precisely how @mfellner has outlined, and it does exactly as he has specified. However, I've found his example is not complete. I'm able to return the uploaded filename, mimetype, and encoding. However, when I try to stream the data to S3, I get the following error:
Error: Unexpected end of multipart data
I've tried many different ways to get this upload file into the S3.Body, but cannot see how to do it. Ideas?

@ziplizard
Copy link

It appears the header content-length is incorrect.

@ziplizard
Copy link

Has anyone actually gotten an upload into S3 (for example) using https://github.com/mfellner/apollo-server-upload-example/blob/master/frontend-server/file-upload-data-source.js ? This example's outcome is just to return the properties of the file. I'm unable to get the actual streamed upload in order to save it.

@ziplizard
Copy link

@mfellner were you able to save the stream from the federated service?

I have tried very simply this:

const { filename, mimetype, encoding, createReadStream } = await file;
const stream = await createReadStream();
const params = {
  Bucket: bucket,
  Key: filename,
  ContentType: mimetype,
  ContentEncoding: encoding,
  Body: stream,
};
return s3.upload(params).promise();

And get the following error:
Error: Unexpected end of multipart data

I'm very interested in how you are able to solve this. Thanks.

@jacob-ebey
Copy link

I feel like the gateway should just forward the file on by default to the federated service unless some other method of handling it is provided.

I also feel this "method of handling it" should be a set of interfaces, one that exists on the gateway and one on the federated service. This interface could be implemented in packages such as "@apollo/federated-files-fs" and "@apollo/federated-files-s3", etc...

Apollo could provide the "fs" and default forwarding implementations, and I'm sure we will build out the rest as needed.

With files being scalar types, and having these proposed packages provide a "middleware" for both the gateway and federated service, each of these packages could utilize whatever data-format is appropriate for the implementation while still allowing files to be sent directly to the federated service.

For example, the info sent from the gateway to a federated service for the "s3" package may just contain the bucket and key name of the file, allowing for tiny transfers between the gateway and the federated services.

The "tricky" part here will be determining what variables these "middlewares" provided by the packages would be responsible for. If the gateway has access to the schema types of the variables in the query, this should not be an issue. I have yet to dig into the gateway source though, so I'm not familiar with it.

Thoughts on this?

@ziplizard
Copy link

ziplizard commented Sep 3, 2019

To consider: The federated service should allow uploads directly without the gateway. There should be no dependencies to the gateway. This makes local development much much easier.

@bl42
Copy link
Contributor

bl42 commented Sep 4, 2019

And get the following error:
Error: Unexpected end of multipart data

@ziplizard did you have any success with this? I am facing the same issue and would be interested in knowing

@bl42
Copy link
Contributor

bl42 commented Sep 4, 2019

https://gist.github.com/bl42/881d85ee0dd92bbe68e93dcd0c99cfdb

So I got the Unexpected end of multipart data issue solved via buffers
It will go from the gateway to S3 buckets...

The response for some reason is missing the data payload... I have no idea why but I'll dig around more tomorrow.

EDIT: Fixed the response issue and I got my S3 use case working.

@ziplizard
Copy link

@bl42 I have not gotten to work yet, but will checkout your gist and work on it. Thanks.

@ziplizard
Copy link

It works! Thanks so much @bl42!

@edelreal
Copy link

edelreal commented Oct 17, 2019

This worked great for me, thanks @mfellner , @bl42 !

@yaacovCR
Copy link

@yaacovCR
Copy link

graphql-tools-fork v8.1.0 now provides support for proxying remote file uploads for those still using schema stitching who want to proxy all the things.

Instead of using Buffers to read the stream into memory, we extend the FormData class to allow streams of unknowable length, and intercept all form.getLength() and getLengthSync() calls to return undefined, fixing use with node-fetch without any patches.

This may be helpful to Federation users as a tweak to the above gist.

See: https://github.com/yaacovCR/graphql-tools-fork/blob/master/src/links/createServerHttpLink.ts#L15-L63

Feedback/suggestions welcome.

@kojuka
Copy link

kojuka commented May 7, 2020

@bl42 Thanks for the gist. I think its working for me. At least the files are being uploaded correctly now.

However, I'm running into this an error on this line:

const body = await this.didReceiveResponse(httpResponse, httpRequest);

Here is the error:

TypeError: _this.didReceiveResponse is not a function"

The file still gets uploaded, but the response comes back with an error.

Any thoughts on this?

@yaacovCR
Copy link

yaacovCR commented May 7, 2020

Fyi, graphql-tools-fork has been merged into graphql-tools v5 and may help you stitch all the things.

Fixed FormData is also available there.

@kojuka
Copy link

kojuka commented May 7, 2020

I got it working by making this change:

from:

const body = await this.didReceiveResponse(httpResponse, httpRequest);

to:

let body = await this.parseBody(httpResponse);

Now the gateway is uploading correctly and returning the right response.

Here is my working gist for anyone else who runs into this:
https://gist.github.com/kojuka/b39d2e3991c0528a1db7d7ed349bf319

Thanks!

@ryanvade
Copy link

ryanvade commented Nov 9, 2020

This seems to be broken now with Node v14 via apollographql/apollo-server#3508

@abernix abernix transferred this issue from apollographql/apollo-server Jan 15, 2021
@doanthai
Copy link

Some help from here: https://medium.com/profusion-engineering/file-uploads-graphql-and-apollo-federation-c5a878707f4c

@sijonelis
Copy link

This seems to be broken now with Node v14 via apollographql/apollo-server#3508

apollographql/apollo-server#3508 (comment) this link describes on how to update the graphql-upload dep that is causing the issue (the one used in apollo-server is 10! major versions behind.

The gists here would need some tweaking to work with this change though...

@glasser
Copy link
Member

glasser commented May 26, 2021

Yes, we could not upgrade the built in version of graphql-upload further due to backwards compatibility concerns. (Because of this, we are removing the direct integration from Apollo Server 3 and allowing you to integrate the version of your choice yourself instead.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests