HPCC-31968 Increase ECLWatch upload buffer size to 1MB #18721

asselitx · 2024-05-31T17:01:45Z

Get a quick, reasonable fix deployed. The former 1K buffer size is the likely cause of slow uploads to the cloud, and possibly of increased cost in some cases. A subsequent ticket will use the configured preferred size per landing zone.

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:

Tested locally with instrumentation to confirm that the upload completes in fewer read/write cycles.
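As a rough illustration of why a larger buffer means fewer read/write cycles, the arithmetic can be sketched as below. The helper name `minReadCycles` and the 100MB example size are hypothetical and not taken from the PR. The result is only a lower bound, since a socket read may return fewer bytes than requested.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the minimum number of read calls needed to consume
// totalBytes when each call returns at most chunkSize bytes (ceiling division).
inline std::int64_t minReadCycles(std::int64_t totalBytes, std::int64_t chunkSize)
{
    assert(totalBytes >= 0);
    assert(chunkSize > 0);
    return (totalBytes + chunkSize - 1) / chunkSize;
}
```

Under these assumptions, a 100MB upload needs at least `minReadCycles(100 * 1048576LL, 1024)` = 102400 cycles with the old 1K buffer, and at least `minReadCycles(100 * 1048576LL, 1048576)` = 100 cycles with a 1MB buffer.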

asselitx · 2024-05-31T17:04:56Z

@ghalliday let me know if someone else is better suited to review. Tim is out or I would have asked him.

github-actions · 2024-05-31T17:16:04Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-31968

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR

ghalliday

@asselitx see comments and suggestion. Please ask me if you have any questions.

ghalliday · 2024-06-03T08:48:30Z

esp/bindings/http/platform/httptransport.cpp

@@ -2123,8 +2123,8 @@ int CHttpRequest::processHeaders(IMultiException *me)

 bool CHttpRequest::readContentToBuffer(MemoryBuffer& buffer, __int64& bytesNotRead)
 {
-    char buf[1024 + 1];
-    __int64 buflen = 1024;
+    char buf[1048576 + 1];


Allocating 1MB on the stack could cause problems.

The function also has a couple of other problems:

It is unnecessarily memcpy-ing the data - which adds up if the file is large.

It is unnecessarily adding a null terminator onto the string that was read (which makes the code confusing).

Better is something like:

    constexpr size32_t readChunkSize = 0x100000;
    // Read at most one chunk, or whatever remains if that is smaller
    size32_t sizeToRead = bytesNotRead > readChunkSize ? readChunkSize : (size32_t)bytesNotRead;
    size32_t prevLen = buffer.length();
    char * target = (char *)buffer.reserve(sizeToRead);
    int readlen = m_bufferedsocket->read(target, sizeToRead);
    if (readlen <= 0)
    {
        if (readlen < 0)
            DBGLOG("Failed to read from socket");
        // Discard the reserved space so the buffer is unchanged on failure
        buffer.setLength(prevLen);
        return false;
    }
    buffer.setLength(prevLen + readlen);
    bytesNotRead -= readlen;
    return true;

(Untested). It reads the data directly into the buffer rather than copying it.
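The same reserve-then-trim pattern can be tried outside the platform with a minimal, self-contained sketch. Everything here is a stand-in and an assumption: `FakeSocket` is a hypothetical in-memory replacement for the buffered socket, and `std::vector<char>` plays the role of `MemoryBuffer` (with `resize` approximating `reserve`/`setLength`). It is not the platform code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the buffered socket: serves bytes from a string,
// returning at most maxLen bytes per call and 0 once the data is exhausted.
struct FakeSocket
{
    std::string data;
    std::size_t pos = 0;

    int read(char * target, std::size_t maxLen)
    {
        std::size_t n = std::min(maxLen, data.size() - pos);
        std::memcpy(target, data.data() + pos, n);
        pos += n;
        return static_cast<int>(n);
    }
};

constexpr std::size_t readChunkSize = 0x100000; // 1MB, as in the suggestion

// Reads one chunk straight into the tail of `buffer`; no temporary stack
// buffer and no second copy. std::vector stands in for MemoryBuffer here.
bool readContentToBuffer(FakeSocket & sock, std::vector<char> & buffer, std::int64_t & bytesNotRead)
{
    assert(bytesNotRead >= 0);
    std::size_t sizeToRead = bytesNotRead > static_cast<std::int64_t>(readChunkSize)
                                 ? readChunkSize
                                 : static_cast<std::size_t>(bytesNotRead);
    if (sizeToRead == 0)
        return false; // nothing left to read (a choice made for this sketch)

    std::size_t prevLen = buffer.size();
    buffer.resize(prevLen + sizeToRead); // grow the buffer, then read in place
    int readLen = sock.read(buffer.data() + prevLen, sizeToRead);
    if (readLen <= 0)
    {
        buffer.resize(prevLen); // leave the buffer unchanged on failure
        return false;
    }
    buffer.resize(prevLen + static_cast<std::size_t>(readLen)); // trim to what was read
    bytesNotRead -= readLen;
    return true;
}
```

Calling `readContentToBuffer` in a loop while `bytesNotRead > 0` drains the fake socket one chunk at a time. The heap-backed buffer grows as needed, so no large array is ever placed on the stack.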

asselitx · 2024-06-05

This makes sense; it is a much better solution. I'd completely missed the stack-size consideration, and the reserve on the buffer is slick. I'm implementing this and testing.

In HttpTransport, update readContentToBuffer function to read up to 1MB chunks directly into the MemoryBuffer rather than into a temporary stack buffer.

Former 1K size likely cause of slow uploads to cloud, if not increased cost in some cases. Subsequent ticket will use configured preferred size per landing zone.

Signed-off-by: Terrence Asselin <[email protected]>

ghalliday · 2024-06-07T11:49:13Z

esp/bindings/http/platform/httptransport.cpp

+    size32_t sizeToRead = bytesNotRead > readChunkSize ? readChunkSize: (size32_t)bytesNotRead;
+    size32_t prevLen = buffer.length();
+
+    // BufferedSocket::read buffer must be at least one larger than its maxlen argument


Well spotted (and that is a terrible interface!)
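To illustrate the constraint noted in that code comment, here is a minimal sketch of reserving one extra byte and then trimming it away. The source only states the size requirement; that the reader writes a terminator just past the data is an assumption made for this example, and `TerminatingReader` and `appendChunk` are hypothetical names, with `std::vector<char>` standing in for `MemoryBuffer`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical reader whose contract requires room for maxLen + 1 bytes.
// ASSUMPTION: it writes a '\0' immediately after the bytes it returns.
struct TerminatingReader
{
    std::string data;
    std::size_t pos = 0;

    int read(char * target, std::size_t maxLen)
    {
        std::size_t n = std::min(maxLen, data.size() - pos);
        std::memcpy(target, data.data() + pos, n);
        target[n] = '\0'; // the extra byte the caller must leave room for
        pos += n;
        return static_cast<int>(n);
    }
};

// Appends up to sizeToRead bytes to `buffer`, reserving one spare byte for
// the reader and trimming it off afterwards so it never becomes content.
bool appendChunk(TerminatingReader & reader, std::vector<char> & buffer, std::size_t sizeToRead)
{
    std::size_t prevLen = buffer.size();
    buffer.resize(prevLen + sizeToRead + 1); // +1 satisfies the reader's contract
    int readLen = reader.read(buffer.data() + prevLen, sizeToRead);
    if (readLen <= 0)
    {
        buffer.resize(prevLen); // nothing read: restore the original length
        return false;
    }
    buffer.resize(prevLen + static_cast<std::size_t>(readLen)); // drop the spare byte
    return true;
}
```

Because the final length is set to `prevLen + readLen`, the spare byte is overwritten by the next chunk or discarded, so the uploaded content is not altered by the extra reservation.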

asselitx requested a review from ghalliday May 31, 2024 17:03

asselitx force-pushed the slow-1mb-hpcc-31968 branch from fdcb624 to f33a58d Compare May 31, 2024 19:36

ghalliday requested changes Jun 3, 2024


asselitx force-pushed the slow-1mb-hpcc-31968 branch from f33a58d to 2f399ca Compare June 5, 2024 21:38

asselitx requested a review from ghalliday June 5, 2024 21:38

ghalliday approved these changes Jun 7, 2024


ghalliday merged commit fea585e into hpcc-systems:candidate-9.6.x Jun 7, 2024
51 of 52 checks passed
