Proposal: object storage building block #18
Conversation
I really like this proposal.
> To retrieve an object, clients make a GET request:
>
> ```text
> GET http://localhost:<port>/v1.0-alpha1/objects/<component>/<key>
> ```
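For example, fetching an object with curl, assuming Dapr's default HTTP port 3500 and a hypothetical component named `mystore`:

```text
curl -o report.pdf http://localhost:3500/v1.0-alpha1/objects/mystore/report.pdf
```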
So, we would call it the "objects" API and not "blob"? I think "blob" is clear about the unstructured nature of it, while "object" reminds me of JSON. It might just be me, but food for thought.
I had the same doubt. I landed on "object" because the industry-standard term seems to be "object storage", but I agree that "blob" does sound better. I'm fine either way.
It would be great for people to vote on the naming for this here.
I think the points you two just mentioned should be part of the overview: that both "blob" and "object" are commonly used to refer to the concept at hand, and that the industry usually uses "object".
I will say that blob makes it non-ambiguous to me to what you are targeting with this building block.
Arguably "Object storage" contains more than just a binary blob, there's often metadata attached to it -- AWS and Google whereas blob storage usually refers to an opaque large binary object so I think naming matters; are we purely offering the opaque data or something that requires metadata be associated with it?
Metadata can be added (as described in the doc) but isn't required.
I'd go for "object storage" here since metadata, while optional, can play a major part in how the API behaves. This also matches the object storage notion in AWS and GCP, which treat metadata as an optional construct.
Then I would agree that it feels more like an object store than a blob store, which is fine because objects can be blobs (i.e., no metadata associated), but blobs can't really be objects by their accepted definitions.
> * Updated: 2022-12-14
> * Original proposals / discussions: dapr/dapr#4808, dapr/dapr#4934
>
> ## Overview
What is really missing from this proposal is the why (why users would care about this), the actual problems it helps solve and a mention of the alternatives. All these sections are indicated in the proposed template and need to be added here for clarity.
Added
Thanks, can you mention the alternatives? I have additional comments that would best go under that section.
Besides Dapr state stores (which I sort-of mentioned), what other alternatives do you think are worth including? Are non-Dapr solutions (like libraries that abstract various object storage services) one thing I should include?
Exactly. What a user would do without Dapr (for example, going with a native SDK) is important: it was included in other proposals, and it serves as a really great way to know whether the value Dapr adds for the user is worth both the maintenance effort for the project and the effort for users who need to run it. (Dapr has overhead, and we need to make sure the APIs/features we add have a considerable value add.)
> The proposal involves creating a new `objectstore` building block with the following interface:
>
> ```go
> type ObjectStore interface {
> ```
What about listing objects?
I am unsure about listing objects at this point. The problem is that it's a very hard API to build in a scalable, performant, and consistent way.
There could be hundreds of thousands or millions of files in a storage account / bucket, and returning them all is not practical, both for Dapr and for the service. Users will try to list all objects in a storage account or bucket (it's inevitable!) and that will cause problems.
The "obvious" solution here is to add pagination, which means we limit the number of objects we return in each list response, but that's actually pretty complicated in practice because some backends have no native support for pagination (local filesystems), and those that do are inconsistent in how they implement it.
It's also very hard to do pagination in a consistent way (in the sense of data consistency), because if files are modified between pagination requests, the result is undefined.
My preference for now would be to move forward without listing objects, so this (already very important) proposal isn't delayed by complications in the design and implementation of listing. It is something we could add later on, if we can figure out the correct way to do it.
> so this (already very important) proposal isn't delayed
Very important based on what? That's a very subjective statement; I can't relate to that as a reviewer (and in this case, a voter), and any sort of urgency expressed here is redundant to the proposal.
Two things to consider here:
- Proposals are separated from milestones. This can go anywhere from 1.11 to 1.14 and beyond.
- Because proposals are separated from milestones, there is no requirement for every feature listed in a proposal to fit within a single milestone. Specifying the wanted interfaces serves as a decision record we can look back to and then derive milestone work for. How that actually works in practice remains to be seen as this is a new process for us.
I don't feel too strongly about listing objects and merely wanted to know if we view listing objects as an integral part of the overall experience.
Alright, it wasn't clear to me that proposals did not have to be accepted and implemented in toto.
I am still unsure how to correctly implement, and even design, an API for listing; I will need to let this sit with me for a bit and perhaps even make some POCs. But I'm happy to accept suggestions here and I can update this proposal if someone has good ideas and/or experience with designing such an API.
Also, no particular urgency (in fact, the initial proposal was made in June). Just hoping the entire proposal doesn't get stalled because of lack of clarity on the listing API.
> Just hoping the entire proposal doesn't get stalled because of lack of clarity on the listing API.
That should not happen, at least from my perspective.
Regarding the list operation, I agree that for most any object store this particular operation is very costly and inefficient (and therefore largely unused, except by individuals / tools doing administrative operations such as cleaning up stale data). A common solution is keeping a secondary index stored elsewhere in something that is queryable, so I think it's okay not to add it to the building block.
Listing is super important, but perhaps consider it an "optional" feature of the underlying storage? For example, local storage on disk can list files very quickly and cheaply, while it may be a burden for S3/Azure.
> The "obvious" solution here is to add pagination
To avoid paging, why not make the listing a stream that you can consume at your convenience...?
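A channel-based sketch of what such an optional listing interface could look like in Go; this is purely illustrative and not part of the proposal (`ObjectLister` and `ListEntry` are invented names):

```go
package objectstore

import "context"

// ListEntry is one result in a listing stream; Err is set instead of Key
// if enumeration failed partway through.
type ListEntry struct {
	Key string
	Err error
}

// ObjectLister is a hypothetical optional interface: components that can
// enumerate cheaply (e.g. local disk) implement it, others simply don't.
// Consumers read entries at their own pace and cancel via ctx when done,
// which sidesteps pagination tokens entirely.
type ObjectLister interface {
	ListObjects(ctx context.Context, prefix string) (<-chan ListEntry, error)
}
```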
> #### Storing an object
>
> To store an object, clients make a PUT request:
If a user has a state store component that is used both for KV and Blob, will a key from one API override the other? If so, this might create inconsistencies that are very hard to reason about, especially during runtime.
We should recommend that users not use the same storage accounts / buckets across multiple building blocks.
I don't see this much different than someone using for example Postgres as a Dapr state store and also interacting with the database directly using the Postgres APIs.
I see a great difference, because in a Dapr state Postgres + direct Postgres scenario, saving the same key would actually work as expected by default, whereas with Dapr state S3 and Dapr object S3 you get values being overridden by default.
I am not following, sorry. What do you mean by:
> with Dapr state S3 and Dapr object S3 you get values being overridden by default.
If you're calling a Dapr S3 state store to save state with key `1`, then call a Dapr object store to store an object with key `1`, a GET operation on the S3 state store would return the data saved by the object store.
Right, that's the same as using the Dapr Postgres state store to save state with key `1`, then using the Postgres client to query for key `1`. It is working as intended IMHO. (Although keep in mind that Dapr state stores have a prefix by default, so users who do want to overwrite `1` would actually need to write `prefix/1`.)
The documentation should not encourage users to use multiple building blocks to access the same resources.
Saving state with Dapr for key `1` results in a key in the format of `app-id||1`. Saving state with an SDK for key `1` would result in a key of `1`.
So if a user saves two different values, with and without Dapr, and performs a GET call with and without Dapr, the values would be taken from two different keys, as expected. That's not the case with Dapr state and object storage, where both operations target `app-id||1`, which is the same key.
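To illustrate the collision being described, assuming the `app-id||key` prefixing above:

```text
Dapr state store,  key "1"  -> writes "app-id||1"
Dapr object store, key "1"  -> writes "app-id||1"   (same key: collision)
Native SDK,        key "1"  -> writes "1"           (different key: no collision)
```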
> The documentation should not encourage users to use multiple building blocks to access the same resources.
Best to incorporate into the proposal as a note then.
> Where `<component>` is the name of the object store component, and `<key>` is the key (name) of the object.
>
> The request body contains the raw bytes of the object, which are read by Dapr in a streamed way and are sent to the object store as soon as they are available (once again, Dapr does not buffer the data in-memory before storing it).
It may be a good idea to include the object size in this request. I am not certain about other platforms but S3, for example, requires you use the multipart upload mechanism if the object to be stored exceeds the 5GB put operation maximum, which would require knowledge of the expected size so that the component can use the right API.
I'm afraid that's not possible because knowing the object size ahead of time would require the client to have all the data already assembled in memory. This won't work for data that is generated on-the-fly and streamed to the object storage service.
When implementing this, we can err on the side of caution and always leverage multipart upload if that's the most universal solution.
I think it is possible to stream files to S3 without knowing the size of the file -- https://stackoverflow.com/questions/8653146/can-i-stream-a-file-upload-to-s3-without-a-content-length-header?rq=1
Yup, I have done it before (with the SDK), confirming it's possible.
Yes, it's possible in S3 using multipart, but that requires separate permissions and a different implementation than a simple put operation. If the requirement is that the size is explicitly not known ahead of time, that should be clear to component developers so that, for example, the S3 implementation can default to using multipart (or at least make it obvious to the developer what limitations it will have).
> I'm afraid that's not possible because knowing the object size ahead of time would require the client to have all the data already assembled in memory.
Since metadata can only be submitted as a header, if you want to compute a checksum, you'll need to traverse the file once before you send it anyway.
We could make object size an optional parameter. Callers can pass it using a `Content-Length` header in the request, and components can (but don't have to) use that to make decisions.
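For reference, the v1 AWS SDK for Go can already stream a body of unknown length: its upload manager reads the `io.Reader` in parts (5 MB by default) and uses multipart upload under the hood. A minimal sketch, with placeholder bucket/key names:

```go
package objectstore

import (
	"io"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// putStream uploads an io.Reader of unknown length; s3manager buffers it
// into parts and performs a multipart upload, so no Content-Length is
// needed up front.
func putStream(r io.Reader) {
	sess := session.Must(session.NewSession())
	uploader := s3manager.NewUploader(sess)
	_, err := uploader.Upload(&s3manager.UploadInput{
		Bucket: aws.String("example-bucket"), // placeholder
		Key:    aws.String("example-key"),    // placeholder
		Body:   r,
	})
	if err != nil {
		log.Fatal(err)
	}
}
```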
@dapr/maintainers-dapr - review ping.
> - This building block will be stream-first. It will work with input and output data as binary streams, and will not perform any transformation or encoding on the data. Thanks to using streams, there are no limits on the size of the input/output data, bypassing `MaxBodySize`.
> - Dapr only supports the "last-write-wins" concurrency pattern for object storage (unlike with state stores). This makes sense considering the intended usage of object storage, which is storing documents that usually do not change. When documents do change, using last-write-wins is consistent with how regular filesystems work, for example.
> - Dapr will not calculate and store the checksum of objects. This is because checksums must be transmitted as a header, but Dapr streams data to the object store, so it can't compute the full checksum until the end of the stream. Apps that want to store checksums (such as the `Content-MD5` header) should compute them beforehand and submit them as a metadata header.
Can metadata also be submitted via a trailer? Imagine that I have a massive file that I want to compute the checksum for: if I can only send it as a header, I must traverse the file twice, once to compute the checksum, then again to submit it to the object store.
I think it would make sense to allow trailing metadata and/or being able to submit metadata separately.
How about adding a method to replace metadata on a blob?
Using trailers is a less ideal option because it means you can't use streaming and need to buffer the entire blob in memory, since SDKs need to have the metadata when the call is made.
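On the single-pass point: with a metadata-replace method like the one suggested, an app could hash while streaming and attach the digest afterwards. A rough Go sketch; the `put` and `replaceMeta` callbacks stand in for hypothetical Dapr SDK calls:

```go
package objectstore

import (
	"crypto/md5"
	"encoding/base64"
	"io"
)

// uploadWithChecksum streams src through an MD5 hasher on its way to the
// object store, then attaches the digest in a second, metadata-only call:
// one traversal of the data, no buffering.
func uploadWithChecksum(src io.Reader, put func(io.Reader) error,
	replaceMeta func(map[string]string) error) error {
	h := md5.New()
	// TeeReader hashes every byte as it flows to the uploader.
	if err := put(io.TeeReader(src, h)); err != nil {
		return err
	}
	// Content-MD5 is conventionally the base64-encoded MD5 digest.
	sum := base64.StdEncoding.EncodeToString(h.Sum(nil))
	return replaceMeta(map[string]string{"Content-MD5": sum})
}
```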
> #### Retrieving an object
>
> To retrieve an object, clients make a GET request:
I would also recommend a HEAD request being allowed if I just want to get the metadata.
I think that's a good suggestion
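Mirroring the proposal's route shape, a metadata-only request would presumably look like this (hypothetical; not in the proposal yet):

```text
HEAD http://localhost:<port>/v1.0-alpha1/objects/<component>/<key>
```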
> - `204 No Content` for successful operations
> - `400 Bad Request` if the state store doesn't exist or the key parameter is missing
> - `409 Conflict` if the object already exists and there's an ETag mismatch
> - `500 Internal Server Error` for all other errors
What about write-only object storage?
What do you mean?
For example, an HSM might be backing this that allows storing new key material, and reading public keys, but not deletions. Maybe the object is backed by a file system but the file is marked read-only.
I'd suggest a `403 Forbidden` to handle the case where deletion isn't allowed.
I'd also recommend a `410 Gone` if the content is already deleted and no longer exists.
Support for HSM is out of scope of this building block. It would more likely be aligned with the crypto and/or secret stores building blocks.
But in general, I support adding more status codes to indicate various scenarios like the ones you describe.
> Support for HSM is out of scope of this building block.
I was illustrating the error, but yeah, I agree.
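Combining the proposal's list with the suggestions above, one possible extended mapping (illustrative only, not agreed upon):

```text
204 No Content             successful operation
400 Bad Request            component doesn't exist or the key parameter is missing
403 Forbidden              backend forbids the operation (e.g. read-only or write-only stores)
409 Conflict               object exists and there's an ETag mismatch
410 Gone                   object was already deleted and no longer exists
500 Internal Server Error  all other errors
```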
> ```go
> // Parameter tag is optional and can be used to pass an etag or any similar concurrency-control tag.
> // In case of conflict (e.g. etag mismatch), the method returns ErrConflict.
> DeleteObject(ctx context.Context, key string, tag any) (err error)
> }
> ```
What about getting a public url or reference to the object? For example, S3/Azure/(probably GCP) allows for getting a pre-signed URL that can be accessed for some time. That's incredibly useful when you want to "share" objects with other systems/clients in a secure way.
I think that could be a good suggestion, but I would consider it an optional method, as I'm not sure all blob storage services have a concept of pre-signed URLs.
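If it were added, it could be modeled as an optional Go interface that only capable components implement; a sketch with invented names:

```go
package objectstore

import (
	"context"
	"time"
)

// URLPresigner is a hypothetical optional interface. Components backed by
// services with pre-signed URL support (S3, Azure Blob SAS, GCS signed URLs)
// implement it; Dapr could return a "not supported" error for the rest.
type URLPresigner interface {
	PresignGetURL(ctx context.Context, key string, expiry time.Duration) (string, error)
}
```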
> - The first message in the stream MUST contain all other required keys.
> - The first message in the stream MAY contain a `payload`, but that is not required.
> - Subsequent messages (any message except the first in the stream) MUST contain a `payload` and MUST NOT contain any other property.
> - The last message in the stream MUST contain a `payload` with `complete` set to `true`. That message is assumed to be the last one from the sender and no more messages are to be sent in the stream after that.
See above, but I'd also recommend being able to send any other metadata at the end. For example, content-length, content-hash, or whatever else is important to the application that can only be done once the entire file is consumed.
I would let the server populate content-hash and/or content-length.
For checksumming during upload, if the service and SDK support that, it can be done transparently by the component. Because components will most likely need to perform chunked uploads, the checksum should be included in each chunk anyway.
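To make the quoted framing rules concrete, here is what a valid three-chunk upload stream might look like (field names are illustrative):

```text
msg 1: key="photo.jpg", metadata={...}, payload={data: <chunk 1>, complete: false}
msg 2: payload={data: <chunk 2>, complete: false}
msg 3: payload={data: <chunk 3>, complete: true}    <- final message in the stream
```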
> There can be two types of headers:
>
> - Custom ones have the `x-dapr-` prefix in requests sent between Dapr and the app. When they are stored, they generally do not have the `x-dapr-` prefix, which is removed (although this could be dependent on the implementation of the state store).
IMHO, `x-dapr-` headers feel like they would be for out-of-band communication between the app and sidecar, and not for application metadata. If I recall correctly, most object stores will also send headers as-is from the object store when requested externally, thus potentially leaking information, such as the fact that Dapr is being used (which may be undesirable, much like `wp-content` URLs let any attacker know that they're dealing with WordPress immediately).
I don't have an answer here, just something to consider.
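For the mechanics the quoted paragraph describes, the mapping would presumably look like this (header name invented for illustration):

```text
Header sent to Dapr:        x-dapr-category: invoices
Metadata stored on object:  category: invoices
```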
+1 binding. This is a good proposal and, barring any new major issues, I think it'd be good to move forward with this. Outstanding comments can be addressed as implementation details IMO.
+1 binding. Object storage is widely used; I really wish for Dapr to support it. As @yaron2 said, outstanding comments can be addressed as implementation details IMO.
Rather than putting this under the kitchen sink of the existing state management building block, I propose that it instead be part of a new, next-generation suite of purpose-built state management building blocks. We should leave the existing building block where it is, since it's a dependency for several other blocks, but split this out as a completely separate thing conceptually, leaving the door open for a collection of other specialty state stores to be built alongside it later.

Why a clean break? The existing key/value store is confusing to new developers and quite arguably incomplete as a full-featured key/value store (e.g. no key query support). Discord is filled with people inquiring about the (discontinued, but still present and very limited) query API, and every now and then someone asks about their large files mysteriously not saving because they're exceeding the request limits. It's difficult to add any new functionality to today's state store because it already supports so many storage providers that are themselves so distinctly purpose-built. For developers that need something today, it's a fine option, but I propose that we draw a line here on the first-generation state store as a whole and instead adopt a more specific second-generation model going forward. Object stores don't need querying or transactions, and they'll never be a primary store for actors; they're distinctly different from what we've already got, and I think this is the opportune time to start fresh.

Finally, I propose that this start with the core Get/Set streaming and Delete functionality, with a pile of optional interfaces coming along later: search/filter functionality against the metadata for providers that support it, event notifications (potentially paired with pubsub), or lifecycle management. We can save a more blob-specific state storage (with its own purpose-built component providers, like append-only blob storage or page blobs) for another purpose-built implementation with its own distinct requirements.
Consolidating and updating the proposals for a new object storage building block: dapr/dapr#4804, dapr/dapr#4934
Proposed the creation of a new "object storage" building block that allows storing objects of unstructured data and arbitrary size, and is optimized for reading/writing with streams.