-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-32880 Ensure that legacy write costs are not lost after file read operation #19286
Conversation
Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32880 Jirabot Action Result: |
02c76e5
to
ad901a7
Compare
@shamser I think you need to rebase to fix that build problem. |
@jakesmith I've rebased. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - please see comments
dali/base/dadfs.cpp
Outdated
if (attrs) | ||
{ | ||
if (hasReadWriteCostFields(*attrs)) | ||
if (!attrs->hasProp(getDFUQResultFieldName(DFUQRFreadCost))&& !isFileKey(*attrs)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be clearer if reversed :
if (attrs->hasProp(getDFUQResultFieldName(DFUQRFreadCost))
readCost = attrs->getPropInt64(getDFUQResultFieldName(DFUQRFreadCost));
else if (!isFileKey(*attrs)) // cannot reliably calculate cost of index read based on num reads
{
numDiskReads = attrs->getPropInt64(getDFUQResultFieldName(DFUQRFnumDiskReads));
doLegacyCostCalc = true;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a comment re. why !isFileKey , e.g. // cannot reliably calculate cost of index read based on num reads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or, could this code be using getLegacyReadCost (and getLegacyWriteCost below) - so these conditions/comments are common?
dali/base/dadfs.cpp
Outdated
} | ||
else | ||
readCost = attrs->getPropInt64(getDFUQResultFieldName(DFUQRFreadCost)); | ||
|
||
if (!attrs->hasProp(getDFUQResultFieldName(DFUQRFwriteCost))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, would be clearer if condition reversed.
dali/base/dadfs.cpp
Outdated
cost_type writeCost = file->getPropInt64(getDFUQResultFieldName(DFUQRFwriteCost)); | ||
if (!writeCost) | ||
{ | ||
cost_type writeCost = getLegacyWriteCost(*file, nodeGroup); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is setting a scoped 'writeCost' atm, I think this should be:
writeCost = getLegacyWriteCost(*file, nodeGroup);
f6eec69
to
6f2e1fb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - please see comments.
} | ||
else | ||
{ | ||
// not sure if src superfile can have whichInput==-1 (but if so, this is best effort to calc cost) | ||
totalReadCost += calcFileAccessCost(distributedSource, 0, curProgress.numReads); | ||
} | ||
} | ||
return updateCostAndNumReads(distributedSource, totalNumReads, totalReadCost); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably reasonable to set these super props at this time, but they're going to be wrong if/when the super file is manipulated, e.g. a new subfile added or removed.
I think updateFileAttrs needs to be updated to cope with that.
They're also going to be wrong after o single subfile in an existing super is read ? i.e. what would update the super cost under those circumstances?
When the engines (e.g. thor) reads a super, it updates the sub file read costs in CMasterActivity::calcFileReadCostStats, I can't see that it updates the super costs though?
This PR may be okay as it is, but looks like there are problems with super cost info going stale in general?
@shamser - please comment, and if appropriate, open a new JIRA, reference here, and link it to this JIRA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A jira to revisit super file costs: https://hpccsystems.atlassian.net/browse/HPCC-32901
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shamser - I think we can merge this now, but should revisit the superfile issues (https://hpccsystems.atlassian.net/browse/HPCC-32901) soon.
Please squash.
…d operation. Signed-off-by: Shamser Ahmed <[email protected]>
Jirabot Action Result: |
Type of change:
Checklist:
Smoketest:
Testing: