-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-31648 New StSizePeakEphemeralDisk and StSizePeakTempDisk for sort #18573
Conversation
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-31648 Jirabot Action Result: |
4c59da4
to
59a2343
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - approving, but see comment re. offset_t.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - I think looks good, but it covers Local sort, but only partially covers global sort.
Global sort also spills in tsorts.cpp - CWriteIntercept::write.
I think it would be worth covering that case too, otherwise results are going to be misleading when there's a global sort involved.
thorlcr/thorutil/thmem.cpp
Outdated
rows.kill(); // no longer needed, readers will pull from spillFile. NB: ok to kill array as rows is never written to or expanded | ||
spillFile->noteSize(iFile->size()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When faster updates are revisited, may want to move this into save with a lower granularity (e.g. update ever second or every 1k records, within save).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Jira created for reminded: https://hpccsystems.atlassian.net/browse/HPCC-32048
4949830
to
9479eb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - 1 issue.
thorlcr/msort/tsorts.cpp
Outdated
@@ -91,6 +91,7 @@ class CWriteIntercept : public CSimpleInterface | |||
if (idxFileStream.get()) | |||
{ | |||
idxFileStream->write(sizeof(o),&o); | |||
idxFile->noteSize(idxFileIO->getStatistic(StSizeDiskWrite)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
writeidxofs is going to be called per row, this should be moved to after the idxFileStream->flush() (~ line 268)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - looks good, please squash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser looks good. One question about noteSize(). A couple of style comments, but no expectation for change - especially not this PR.
fileSize = size; | ||
if (fileSizeTracker) | ||
fileSizeTracker->growSize(fileSize); | ||
if (fileSizeTracker && fileSize!=size) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this still be setting fileSize if fileSizeTracker is null?
thorlcr/thorutil/thmem.cpp
Outdated
@@ -1619,6 +1619,7 @@ class CThorRowCollectorBase : public CSpillable | |||
protected: | |||
CThorSpillableRowArray spillableRows; | |||
IPointerArrayOf<CFileOwner> spillFiles; | |||
Linked<CFileSizeTracker> tempFileSizeTracker; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: Don't really need to save because it can be accessed via activity member.
thorlcr/msort/tsorts.cpp
Outdated
idxFile.setown(createIFile(tempname.str())); | ||
idxFileIO.setown(idxFile->open(IFOcreaterw)); | ||
Owned<IFile> tmpFile = createIFile(tempname.str()); | ||
idxFile.setown(new CFileOwner(tmpFile, activity.queryTempFileSizeTracker())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be tempted to add a helper function so this could be
idxFile.setown(activity.createOwnedTempFile(tmpFile))
or even
idxFile.setown(activity.createOwnedTempFile(tempname))
To clarify the code, reduce duplication and improve encapsulation.
Not for this PR though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser please squash and I will merge into the next rc
Signed-off-by: Shamser Ahmed <[email protected]> HPCC-31648 Changes following review Signed-off-by: Shamser Ahmed <[email protected]>
Type of change:
Checklist:
Smoketest:
Testing: