Manual chunk upload for GCS #2480

Open
AmbroiseCouissin opened this issue Aug 1, 2023 · 11 comments
Assignees
Labels
priority: p3 (Desirable enhancement or fix. May not be included in next release.)
type: feature request (‘Nice-to-have’ improvement, new feature or different behavior or design.)

Comments

@AmbroiseCouissin

Hello! Thank you for the work on this SDK; it has been working perfectly for us so far.

We are trying to upload chunks manually in the following way:

  1. The client calls our backend service, which in turn calls the Google.Cloud.Storage.V1 APIs to initialize a chunked upload of a large file (multiple GBs). For this we use client.CreateObjectUploader and then await uploader.InitiateSessionAsync, which gives us a session URI:
using MemoryStream emptyMemoryStream = new();
Google.Apis.Storage.v1.ObjectsResource.InsertMediaUpload uploader = client.CreateObjectUploader(command.BucketName, command.Key, GetMimeType(command.Key), emptyMemoryStream);

Uri uploadUri = await uploader.InitiateSessionAsync(cancellationToken);
  2. The client uploads chunks of data one by one to our backend, which in turn calls the Google.Cloud.Storage.V1 APIs.
    Every time the client uploads a chunk of data, we do the following:
ResumableUpload actualUploader = ResumableUpload.CreateFromUploadUri(new Uri(request.ResumableUrl), new MemoryStream(Convert.FromBase64String(request.ContentAsBase64)));

IUploadProgress uploadProgress = await actualUploader.ResumeAsync(new Uri(request.ResumableUrl), cancellationToken);

Unfortunately, every time a new chunk is uploaded, it replaces the previous one. We are not sure what we are missing, as we have tried a few different approaches.

Could you please let us know if you have any idea what we are doing wrong?

@AmbroiseCouissin AmbroiseCouissin added priority: p3 Desirable enhancement or fix. May not be included in next release. type: question Request for information or clarification. Not an issue. labels Aug 1, 2023
@jskeet jskeet self-assigned this Aug 1, 2023
@jskeet
Collaborator

jskeet commented Aug 1, 2023

I'll look at this when I get a chance, but it's unlikely to be today, I'm afraid. It's been a while since I've looked at "manual" resumable uploads so I'll need to get all the context again.

@AmbroiseCouissin
Author

@jskeet
What would you suggest I do if I needed to implement this soon?
I looked inside the library's code and I'm not sure this is supported (setting the start, end and size of chunks...).
For the sake of time, should I try to implement this myself without the SDK and maybe post the code here?

@jskeet
Collaborator

jskeet commented Aug 1, 2023

If by "soon" you mean you need to do this today, before I have a chance to look into this, then yes, you could look into doing it manually. But I don't remember enough about the protocol involved to give any more advice.

@AmbroiseCouissin
Author

I need to have this implemented by the end of the week. I'll give it a go :)

Thanks for the prompt response!

@jskeet
Collaborator

jskeet commented Aug 1, 2023

Just to check your use case, am I right in saying that you basically want to create a new object, upload lots of chunks separately (potentially from different servers?) and then finalize the object later? (I don't believe you can leave it in a perpetual "keep appending" state, but I could be wrong.)

One option to consider by the way is using the "compose" operation - that's not directly exposed in StorageClient, but if you use the Service property you can get at the underlying StorageService from Google.Apis.Storage.v1 - that way you could upload each chunk as a separate object, and compose them all at the end. I wouldn't suggest that as a permanent solution, but it might be a simple temporary workaround until we've had time to get the write-multiple-chunks-to-a-single-object option working.
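
For readers following along, here is a minimal sketch of the first half of that workaround: uploading each incoming chunk as its own temporary object. The bucket name, the part-naming scheme and the chunkIndex/chunkStream variables are placeholders, not part of this issue.

```csharp
using System.IO;
using Google.Cloud.Storage.V1;

// Sketch only: each backend request uploads the chunk it just received as its
// own temporary object. "my-bucket" and the naming scheme are placeholders.
StorageClient client = await StorageClient.CreateAsync();

int chunkIndex = 3;                                     // sequence number sent by the client
Stream chunkStream = new MemoryStream(new byte[1024]);  // stand-in for the chunk payload

await client.UploadObjectAsync(
    "my-bucket",
    $"my-file.part-{chunkIndex:D5}",    // e.g. my-file.part-00003
    "application/octet-stream",
    chunkStream);
```

A compose step (sketched further down) can then stitch these temporary part objects into the final object, after which the parts can be deleted.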

@AmbroiseCouissin
Author

@jskeet Yes, I want to upload in chunks the following way:

client frontend --- send chunk 1 ---> our backend (.net core) --- send chunk 1 ---> google cloud storage
client frontend --- send chunk 2 ---> our backend (.net core) --- send chunk 2 ---> google cloud storage

The reason is that the file we'd like to upload is very large, and our backend would hit out-of-memory exceptions if the client uploaded the entire file in one go.

The "compose" solution you are proposing would work as a workaround :) I'll do this for now.
Thanks for the tip!

@jskeet
Collaborator

jskeet commented Aug 1, 2023

On a mobile, so briefly - you call Execute or ExecuteAsync on the request.

I suspect that client.Service.Objects.Compose(...) is a simpler way to get a request, too - but what you've got should work.
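
For completeness, a hedged sketch of what that compose call might look like, assuming the chunks were uploaded as my-file.part-NNNNN objects as in the earlier sketch (bucket/object names, chunk count and content type are placeholders). Note that, per the GCS documentation, a single compose request accepts at most 32 source objects.

```csharp
using System.Linq;
using Google.Apis.Storage.v1.Data;
using Google.Cloud.Storage.V1;

// Sketch only: stitch the previously uploaded chunk objects back together.
// Bucket/object names and the chunk count are placeholders.
StorageClient client = await StorageClient.CreateAsync();
int chunkCount = 10;

var body = new ComposeRequest
{
    SourceObjects = Enumerable.Range(0, chunkCount)
        .Select(i => new ComposeRequest.SourceObjectsData { Name = $"my-file.part-{i:D5}" })
        .ToList(),
    Destination = new Google.Apis.Storage.v1.Data.Object { ContentType = "video/mp4" }
};

// client.Service exposes the underlying StorageService from Google.Apis.Storage.v1;
// Compose(...) builds the request and ExecuteAsync() sends it.
await client.Service.Objects.Compose(body, "my-bucket", "my-file.mp4").ExecuteAsync();
```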

@hemanshv

hemanshv commented Aug 1, 2023

+1 to what @jskeet has said. You can use https://cloud.google.com/storage/docs/composing-objects#create-composite-client-libraries as a reference.

@AmbroiseCouissin
Author

@jskeet @hemanshv Yes, I missed the ExecuteAsync() part. Thanks!

@jskeet
Collaborator

jskeet commented Aug 4, 2023

Okay, I've had a look now, and the Upload code always assumes it can tell the server that it's "done" when it reaches the end of the stream. Changing that to allow "upload but don't finalize" may be a significant amount of work - I'm not sure yet. Just working out the best API surface for it is at least somewhat challenging. I'll consult with colleagues next week about how we prioritize this feature request - in the meantime, I hope the workaround is working for you.
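
For reference, and only as a hedged sketch outside the SDK: the underlying resumable-upload protocol lets a client PUT each chunk to the session URI with a Content-Range header. Per the GCS documentation, intermediate chunks use "bytes start-end/*" (with a length that is a multiple of 256 KiB) and leave the object unfinalized, while the final chunk replaces the "*" with the total object size to finalize it. Something along these lines, where sessionUri, chunk, offset and totalSize are placeholders:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

// Hedged sketch of a "manual" chunk upload against an existing resumable session URI.
// Intermediate chunks send "bytes start-end/*" so the object is not finalized;
// the final chunk supplies the total size instead of "*".
static async Task UploadChunkAsync(
    HttpClient http, Uri sessionUri, byte[] chunk, long offset, long? totalSize)
{
    long end = offset + chunk.Length - 1;

    using var request = new HttpRequestMessage(HttpMethod.Put, sessionUri)
    {
        Content = new ByteArrayContent(chunk)
    };
    request.Content.Headers.ContentRange = totalSize.HasValue
        ? new ContentRangeHeaderValue(offset, end, totalSize.Value)  // final chunk
        : new ContentRangeHeaderValue(offset, end);                  // renders as "bytes start-end/*"

    using HttpResponseMessage response = await http.SendAsync(request);

    // GCS responds 308 (Resume Incomplete) to intermediate chunks,
    // and 200/201 once the final chunk has been received.
    if ((int)response.StatusCode != 308)
    {
        response.EnsureSuccessStatusCode();
    }
}
```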

@jskeet
Collaborator

jskeet commented Aug 22, 2023

Quick update: we've spoken with the GCS team, and while there's only one other client library that currently exposes this functionality (partial resumable upload) it is a feature that the GCS team would like to see. No promises on an implementation timeframe, but we'll include it in our planning considerations.

@amanda-tarafa amanda-tarafa added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: question Request for information or clarification. Not an issue. labels Sep 18, 2023