GH-40036: [C++] Azure file system write buffering & async writes #43096
Conversation
No problem. Sorry for not reviewing this and #43098.
The code in DoWrite is very similar to the code in the S3 FS. Maybe this could be unified? I didn't see this in the scope of the PR though.
It would be better if it improves maintainability. We can work on it as a follow-up task.
cpp/src/arrow/filesystem/azurefs.cc (outdated):

```cpp
  Future<> FlushAsync() {
    RETURN_NOT_OK(CheckClosed("flush"));
```
Can we move

arrow/cpp/src/arrow/filesystem/azurefs.cc, lines 1063 to 1067 in 0bae073:

```cpp
    RETURN_NOT_OK(CheckClosed("flush"));
    if (!initialised_) {
      // If the stream has not been successfully initialized then there is nothing to
      // flush. This also avoids some unhandled errors when flushing in the destructor.
      return Status::OK();
```

to `Flush()`? It seems that we don't need to execute `CheckClosed("flush")` and `if (!initialised_)` in both `Flush()` and `FlushAsync()`.
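In other words, roughly this shape (a sketch of the suggestion, not the final code; whether `Flush()` then delegates to `FlushAsync()` is left open here):

```cpp
Status Flush() override {
  RETURN_NOT_OK(CheckClosed("flush"));
  if (!initialised_) {
    // Nothing was initialized yet, so there is nothing to flush; this also avoids
    // unhandled errors when flushing from the destructor.
    return Status::OK();
  }
  // ... perform the actual flush (or delegate to FlushAsync()) ...
  return Status::OK();
}
```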
I moved this to `Flush`, since this is the public API.
I think I agree with @kou. Also, the S3 filesystem implements `Flush()` as

```cpp
Status Flush() override {
  auto fut = FlushAsync();
  return fut.status();
}
```

I think it would be nice if we could do the same, because it ensures that `Flush` and `FlushAsync` have the same behaviour.
I tried to unify the sync/async flush implementations; unfortunately that did not work out due to lifetime issues in the async case. When an `ObjectAppendStream` is destructed the RAII way, `Close()` (and therefore `Flush()`) is called. If we call `FlushAsync()` in the close, we need to create a `shared_ptr` from `this` (to ensure the lifetime of `this` when the lambda is actually called), but we cannot create a `shared_ptr` of `this` while it is being destructed. Hence, in the `Close()` call we must always do a synchronous `Flush`, where we do not have to give these lifetime guarantees.

TL;DR: the `Flush()` and `FlushAsync()` implementations are similar, but slightly different and decoupled.
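To make the constraint concrete, here is a minimal sketch (illustrative only; `DoFlushAsync()` is a hypothetical helper and the class layout is simplified):

```cpp
class ObjectAppendStream : public std::enable_shared_from_this<ObjectAppendStream> {
 public:
  Future<> FlushAsync() {
    // The continuation must keep this object alive until the upload finishes,
    // so it captures a shared_ptr obtained via shared_from_this().
    auto self = shared_from_this();
    return DoFlushAsync().Then([self] { /* commit the staged block list */ });
  }

  Status Close() {
    // Close() also runs from the destructor (RAII usage). At that point no
    // shared_ptr owns the object anymore, so shared_from_this() would throw
    // std::bad_weak_ptr. Hence Close() must take the synchronous Flush() path.
    return Flush();
  }

  Status Flush();           // synchronous implementation, safe to call from Close()
 private:
  Future<> DoFlushAsync();  // hypothetical helper that performs the actual upload
};
```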
cpp/src/arrow/filesystem/azurefs.cc (outdated):

```cpp
auto advance_ptr = [&data_ptr, &nbytes](const int64_t offset) {
  data_ptr += offset;
  nbytes -= offset;
};
```
How about also updating `pos_` and `content_length_` in this lambda, and improving the variable name to reflect the change?
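Something along these lines, perhaps (a sketch of the suggestion inside `DoWrite`; the name `advance` and the exact bookkeeping are proposals, not the merged code):

```cpp
// Advance both the input cursor and the stream's position bookkeeping.
auto advance = [&](const int64_t nbytes_written) {
  data_ptr += nbytes_written;
  nbytes -= nbytes_written;
  pos_ += nbytes_written;
  content_length_ += nbytes_written;  // append-only stream: length grows with pos_
};
```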
```cpp
std::shared_ptr<Buffer> buffer;
do {
  ASSERT_OK_AND_ASSIGN(buffer, input->Read(128 * 1024));
  ASSERT_TRUE(buffer);
```
Why do we need this?
I guess it asserts that reading the input in fact worked (because that also uses the underlying Azure file system implementation). But we could also get rid of it; I don't have a strong opinion on it, but I also don't see any harm.
OK. Let's remove it. The previous `ASSERT_OK_AND_ASSIGN(buffer, ...)` must already detect any invalid situation.
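The read-back loop would then look roughly like this (a sketch; the accumulation into `contents` and the termination condition are assumed from a typical read-back test, not copied from the PR):

```cpp
std::string contents;
std::shared_ptr<Buffer> buffer;
do {
  ASSERT_OK_AND_ASSIGN(buffer, input->Read(128 * 1024));
  contents.append(buffer->ToString());
} while (buffer->size() != 0);
```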
```cpp
auto data = SetUpPreexistingData();
const auto path = data.ContainerPath("test-write-object");
ASSERT_OK_AND_ASSIGN(auto output, fs->OpenOutputStream(path, {}));
std::array<std::int64_t, 3> sizes{2570 * 1024, 258 * 1024, 259 * 1024};
```
Could you add a comment explaining why we should use these sizes?
These sizes were kind of arbitrary (I did not introduce them), so I changed them to more reasonable sizes. The rationale of the test is just to issue writes of different sizes that trigger different mechanisms (e.g. buffering, uploading directly, etc.).
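For illustration, the intent is roughly this (the sizes and the 256 KiB threshold below are placeholders, not the values actually used in the PR):

```cpp
// Assume a hypothetical block-upload threshold of 256 KiB for this sketch.
std::array<std::int64_t, 3> sizes{
    1 * 1024,    // small write: stays in the in-memory buffer
    256 * 1024,  // fills exactly one block: triggers a block upload
    300 * 1024,  // larger than a block: uploads a block and buffers the remainder
};
```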
```cpp
    std::string(sizes[2], 'C'),
};
auto expected = std::int64_t{0};
for (auto i = 0; i != 3; ++i) {
```
Can we use `i < buffers.size()` here?
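That is, something like (a sketch; the loop body is assumed from the surrounding test, and the index type is widened to match `buffers.size()`):

```cpp
for (std::size_t i = 0; i < buffers.size(); ++i) {
  ASSERT_OK(output->Write(buffers[i]));
  expected += static_cast<std::int64_t>(buffers[i].size());
  ASSERT_OK_AND_ASSIGN(auto position, output->Tell());
  ASSERT_EQ(expected, position);
}
```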
```cpp
std::shared_ptr<io::OutputStream> stream;
auto data = SetUpPreexistingData();
const std::string path = data.ContainerPath("test-write-object");
constexpr auto payload = PreexistingData::kLoremIpsum;

ASSERT_OK_AND_ASSIGN(stream, fs->OpenOutputStream(path));
```
Suggested change:

```diff
-  std::shared_ptr<io::OutputStream> stream;
   auto data = SetUpPreexistingData();
   const std::string path = data.ContainerPath("test-write-object");
   constexpr auto payload = PreexistingData::kLoremIpsum;
-  ASSERT_OK_AND_ASSIGN(stream, fs->OpenOutputStream(path));
+  ASSERT_OK_AND_ASSIGN(auto stream, fs->OpenOutputStream(path));
```
```cpp
ASSERT_OK_AND_ASSIGN(stream, fs->OpenOutputStream(path));
ASSERT_OK(stream->Write(payload));
// Destructor implicitly closes stream and completes the multipart upload.
// GH-37670: Testing it doesn't matter whether flush is triggered asynchronously
```
It seems that GH-37670 is for the S3 filesystem. Is this true for the Azure filesystem too?
Yes, though not with that exact wording, as we don't use a "multipart upload". But we also trigger the upload of the last data, if any, and then commit it, so there is some I/O when the stream is closed which we need to make sure we wait on. I clarified the comment and removed the reference to the GitHub issue.
```cpp
std::shared_ptr<io::OutputStream> stream;
constexpr auto* payload = "new data";
auto data = SetUpPreexistingData();
const std::string path = data.ContainerPath("test-write-object");

ASSERT_OK_AND_ASSIGN(stream, fs->OpenOutputStream(path));
```
Suggested change:

```diff
-  std::shared_ptr<io::OutputStream> stream;
   constexpr auto* payload = "new data";
   auto data = SetUpPreexistingData();
   const std::string path = data.ContainerPath("test-write-object");
-  ASSERT_OK_AND_ASSIGN(stream, fs->OpenOutputStream(path));
+  ASSERT_OK_AND_ASSIGN(auto stream, fs->OpenOutputStream(path));
```
cpp/src/arrow/filesystem/azurefs.cc (outdated):

```cpp
ARROW_ASSIGN_OR_RAISE(current_block_, io::BufferOutputStream::Create(
                                          kBlockUploadSize, io_context_.pool()));
```
Can we reuse a `Reset()`-ed `current_block_` instead of creating a new one?
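A sketch of what that reuse could look like (assuming `io::BufferOutputStream::Reset()` taking a capacity and memory pool, and that `current_block_` may still be null on the first write):

```cpp
if (current_block_ == nullptr) {
  ARROW_ASSIGN_OR_RAISE(current_block_, io::BufferOutputStream::Create(
                                            kBlockUploadSize, io_context_.pool()));
} else {
  // Reuse the existing stream object instead of allocating a new one.
  RETURN_NOT_OK(current_block_->Reset(kBlockUploadSize, io_context_.pool()));
}
```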
Yeah, I just have to admit that I haven't come up with a good idea to factor this out and provide a good abstraction on top, because this functionality may be needed by any file system that implements write buffering.
This is a complicated change (at least for me) and I don't currently have time to give a full review.
A more detailed leak trace (obtained using …):

The leak manifests in …
Interesting, I don't really see where this could come from, actually. From the tests that are affected, it looks to me like only the path where we go through …
@OliLay There's already a skip related to a libxml2 leak with threaded operation, so I added another one. This might actually be fixed by Azure/azure-sdk-for-cpp#5767, which was recently merged into the Azure C++ SDK. Also cc @felipecrv
Actually, no, we may need something like this hack from ClickHouse; it was added in this PR: ClickHouse/ClickHouse#45796
@github-actions crossbow submit -g cpp
Force-pushed from ed47d2c to d337e09.
Hmm, I rebased for CI.
@github-actions crossbow submit -g cpp
Revision: d337e09

Submitted crossbow builds: ursacomputing/crossbow @ actions-0a8f77add1
The macOS build failure is unrelated, as it has happened elsewhere too: https://github.com/apache/arrow/actions/runs/10481257647/job/29030430952
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit e1e7c50. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Rationale for this change
See #40036.
What changes are included in this PR?
Write buffering and async writes (similar to what the S3 file system does) in the `ObjectAppendStream` for the Azure file system.

With write buffering and async writes, the input-scenario creation runtime in the tests (which uses the `ObjectAppendStream` against Azurite) decreased from ~25s (see here) to ~800ms.

Are these changes tested?
Added some tests with background writes enabled and disabled (some were taken from the S3 tests). Everything changed should be covered.
Are there any user-facing changes?
`AzureOptions` now allows `background_writes` to be set (default: true). No breaking changes.
is very similar to the code in the S3 FS. Maybe this could be unified? I didn't see this in the scope of the PR though.ObjectAppendStream::DoAppend
in the case of many small appends #40036