-
Notifications
You must be signed in to change notification settings - Fork 304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HPCC-30588 Publish descriptions for each of the statistic types #18299
Conversation
https://track.hpccsystems.com/browse/HPCC-30588 |
Test by using the url: |
Signed-off-by: Gavin Halliday <[email protected]>
053279c
to
a14c681
Compare
@JamesDeFabia please check the language |
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" }, | ||
{ NUMSTAT(RowsProcessed), "How many rows have been processed" }, | ||
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" }, | ||
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: missing period prior to new line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An activity is active if this does not match NumStop
not sure worth expanding on, but mismatch might indicate an abnormal termination state vs active.
system/jlib/jstats.cpp
Outdated
{ PERSTAT(Replicated), "The percentage replication complete" }, | ||
{ NUMSTAT(DiskRowsRead), "The number of rows read from the file" }, | ||
{ NUMSTAT(IndexRowsRead), "The number of rows read from the index" }, | ||
{ NUMSTAT(DiskAccepted), "The number of disk rows that resturn a result from the transform" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typos resturn
system/jlib/jstats.cpp
Outdated
{ TIMESTAT(SpillElapsed), "Time spent spilling rows from memory to disk" }, //MORE: Do we have a similar stat for SpillRead? | ||
{ TIMESTAT(SortElapsed), "Time spent sorting rows in memory" }, | ||
{ NUMSTAT(Groups), "The number of groups processed by this activity" }, | ||
{ NUMSTAT(GroupMax), "The size of the largest group processed by this activity.\nA skew in group size can cause a skew in processing time. A large skew may indicate some special values would benefit from special casing." }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: period at end of line is inconsistent.
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(ScansPerRow), UNUSED }, | ||
{ NUMSTAT(Allocations), "The number of allocations from the row memory" }, | ||
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" }, | ||
{ NUMSTAT(DiskRetries), "The number of times an io operation was retried.\nIf this is non zero it may suggest a problem with the underlying disk storage" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io
-> IO
(Similar to OS)?
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(SysContextSwitches), "The number of context switches that occurred when processing" }, | ||
{ TIMESTAT(OsUser), "Total elapsed user-space time" }, | ||
{ TIMESTAT(OsSystem), "Total time spent in the system/kernel" }, | ||
{ TIMESTAT(OsTotal), "Total elapsed time according to the OS.\nIncludes system,user,idle and iowait times" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: spaces after commas?
system/jlib/jstats.cpp
Outdated
{ WHENFIRSTSTAT(Dequeued), "The time when this item was removed from a queue" }, | ||
{ WHENFIRSTSTAT(K8sLaunched), "The time when the job to procss this item was launched" }, | ||
{ WHENFIRSTSTAT(K8sStarted), "The time when the job to procss this item started executing/nThe difference between the K8sStarted and K8sLaunched indicates how long Kubernetes took to resource and initialised the job." }, | ||
{ WHENFIRSTSTAT(K8sReady), "The time when the Thor job is ready to process\nThe difference with K8sStarted inidcates how long it took to resource and start the slave processes" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing period before newline.
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(RowMemory), "The size of memory used to store rows" }, | ||
{ SIZESTAT(PeakRowMemory), "The peak memory used to store rows" }, | ||
{ SIZESTAT(AgentSend), "The size of data sent to the agent from the server" }, | ||
{ TIMESTAT(IndexCacheBlocked), "The time spend waiting to access the index page cache" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spend
-> spent
?
system/jlib/jstats.cpp
Outdated
{ CYCLESTAT(IndexCacheBlocked) }, | ||
{ TIMESTAT(AgentProcess) }, | ||
{ TIMESTAT(AgentProcess), "The total time spend by the agents processing requests" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spend
-> spent
?
{ SIZESTAT(ContinuationData) }, | ||
{ NUMSTAT(ContinuationRequests) }, | ||
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" }, | ||
{ SIZESTAT(ContinuationData), "The total size of continuation data sent from agent to the server\nA large number may indicate a poor filter, or merging from many different index locations" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing period before newline.
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(ContinuationRequests) }, | ||
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" }, | ||
{ SIZESTAT(ContinuationData), "The total size of continuation data sent from agent to the server\nA large number may indicate a poor filter, or merging from many different index locations" }, | ||
{ NUMSTAT(ContinuationRequests), "The number of time the agent indicated there was more data to be returned" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
time
-> times
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments inline. NB: I only reviewed the language in jstats.cpp.
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(GeneratedCpp), "The size of the generated c++ file" }, | ||
{ SIZESTAT(PeakMemory), "The peak memory used while processing this item" }, | ||
{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" }, | ||
{ NUMSTAT(RowsProcessed), "How many rows have been processed" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of rows processed
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" }, | ||
{ NUMSTAT(RowsProcessed), "How many rows have been processed" }, | ||
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" }, | ||
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of times the activity has started executing...
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(RowsProcessed), "How many rows have been processed" }, | ||
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" }, | ||
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" }, | ||
{ NUMSTAT(Stops), "How many times the activity has stopped executing.\nAn activity is active if this is less than NumStarts" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of times the activity has stopped executing.
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(PreloadCacheHits), UNUSED }, | ||
{ NUMSTAT(PreloadCacheAdds), UNUSED }, | ||
{ NUMSTAT(ServerCacheHits), UNUSED }, | ||
{ NUMSTAT(IndexAccepted), "The number of keyed join matches that return a result from the transform" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keywords s/b All CAPS: KEYED JOIN TRANSFORM
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(PreloadCacheAdds), UNUSED }, | ||
{ NUMSTAT(ServerCacheHits), UNUSED }, | ||
{ NUMSTAT(IndexAccepted), "The number of keyed join matches that return a result from the transform" }, | ||
{ NUMSTAT(IndexRejected), "The number of keyed join matches that are skipped by the transform" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keywords s/b All CAPS: KEYED JOIN TRANSFORM
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(SpillFile) }, | ||
{ NUMSTAT(DiskReads), "The number of disk read operations" }, | ||
{ NUMSTAT(DiskWrites), "The number of disk write operations" }, | ||
{ NUMSTAT(Spills), "How many times the activity spilt to disk"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of times the activity spilt to disk
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(ScansPerRow), UNUSED }, | ||
{ NUMSTAT(Allocations), "The number of allocations from the row memory" }, | ||
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" }, | ||
{ NUMSTAT(DiskRetries), "The number of times an io operation was retried.\nIf this is non zero it may suggest a problem with the underlying disk storage" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
io s/b I/O
system/jlib/jstats.cpp
Outdated
{ COSTSTAT(Execute) }, | ||
{ SIZESTAT(AgentReply) }, | ||
{ TIMESTAT(AgentWait) }, | ||
{ COSTSTAT(Execute), "Cpu cost of executing" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cpu s/b CPU
system/jlib/jstats.cpp
Outdated
{ COSTSTAT(Compile) }, | ||
{ TIMESTAT(NodeLoad) }, | ||
{ COSTSTAT(FileAccess), "The transactional cost of any file operations" }, | ||
{ NUMSTAT(Pods), "How many pods were used" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of pods used.
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(AckRetries) }, | ||
{ SIZESTAT(ContinuationData) }, | ||
{ NUMSTAT(ContinuationRequests) }, | ||
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency: The number of times...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ghalliday - looks good.
Some pedantic issues mostly, the present/past-tense of the sentences are a bit inconsistent some of the time - I think it would be better if always past tense.
{ WHENFIRSTSTAT(Created), "The time when an item was created" }, | ||
{ WHENFIRSTSTAT(Compiled), "The time a workunit started being compiled" }, | ||
{ WHENFIRSTSTAT(WorkunitModified), UNUSED }, | ||
{ TIMESTAT(Elapsed), "The elapsed time between starting and finishing\nFor child queries this may be significantly larger than TimeTotalExecute" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing full stop before \n for consistency.
But for ultimate consistency, all single line descriptions should also have a terminating full stop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was aiming for no fullstops since these are likely to appear in tooltips (so I would remove the ones above). @GordonSmith what would you prefer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Primarily consistency, I don't really mind either way.
{ WHENFIRSTSTAT(Compiled), "The time a workunit started being compiled" }, | ||
{ WHENFIRSTSTAT(WorkunitModified), UNUSED }, | ||
{ TIMESTAT(Elapsed), "The elapsed time between starting and finishing\nFor child queries this may be significantly larger than TimeTotalExecute" }, | ||
{ TIMESTAT(LocalExecute), "The time spent executing this activity not including its inputs\nSort activities by local execute time to help isolate potential processing bottlenecks" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing full stop before \n
system/jlib/jstats.cpp
Outdated
{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" }, | ||
{ NUMSTAT(RowsProcessed), "How many rows have been processed" }, | ||
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" }, | ||
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An activity is active if this does not match NumStop
not sure worth expanding on, but mismatch might indicate an abnormal termination state vs active.
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(IndexScans), "The number of index scans.\nHow many entries are sequentially examined after an initial seek (including wild seeks). Large numbers compared to the number of seeks may indicate extra keyed filters would be worthwhile" }, | ||
{ NUMSTAT(IndexWildSeeks), "The number of seeks caused by WILD() filters.\nThe number of keyed lookups that had to search for the next potential match. If this is a high proportion of NumIndexScans it may suggest poor key design" }, | ||
{ NUMSTAT(IndexSkips), "The number of smart-stepping operations that increment the next match" }, | ||
{ NUMSTAT(IndexNullSkips), "The number of smart-stepping operations that have no effect.\nIf this is large compare to NumIndexSkips it suggests the priority may not be set correctly" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that have
that had?
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(PostFiltered), "The number of index matches filtered by the transform and the non-keyed filter" }, | ||
{ NUMSTAT(BlobCacheHits), "The number of times a blob is resolved in the cache" }, | ||
{ NUMSTAT(LeafCacheHits), "The number of times a leaf node is resolved in the cache" }, | ||
{ NUMSTAT(NodeCacheHits), "The number of times a branch node is resolved in the cache" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tense: "is resolved" -> "was resolved"?
NB: below it is past tense : "was read"
{ NUMSTAT(AttribsSimplified), UNUSED }, | ||
{ NUMSTAT(AttribsFromCache), UNUSED }, | ||
{ NUMSTAT(SmartJoinDegradedToLocal), "The number of times a global smart-join switched to a LOCAL JOIN (with distribute)" }, | ||
{ NUMSTAT(SmartJoinSlavesDegradedToStd), "The number of times a global smart-join degraded to a standard join" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it worth saying, after newline. That the above 2 stats will either be 0 or 1, unless in a LOOP?
system/jlib/jstats.cpp
Outdated
{ TIMESTAT(AgentWait) }, | ||
{ COSTSTAT(Execute), "Cpu cost of executing" }, | ||
{ SIZESTAT(AgentReply), "Size of data sent from the workers to the agent" }, | ||
{ TIMESTAT(AgentWait), "Time that the agent spends waiting for a reply from the workers" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spends->spent ?
system/jlib/jstats.cpp
Outdated
{ CYCLESTAT(NodeLoad) }, | ||
{ TIMESTAT(LeafLoad) }, | ||
{ TIMESTAT(LeafLoad), "Time spent reading leaf nodes from disk and decompressing them\nIf this is a high proportion of the time (especially compared to TimeLeadRead) then consider using the new index compression formats" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TimeLeadRead -> TimeLeafRead
const char * queryStatisticDescription(StatisticKind kind) | ||
{ | ||
StatisticKind rawkind = (StatisticKind)(kind & StKindMask); | ||
if (rawkind >= StKindNone && rawkind < StMax) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
result will be same I think either way, but should this be rawKind > StKindNone ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is consistent with the other functions.
Signed-off-by: Gavin Halliday <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made changes (and a few others). I have removed all periods, but I can add them back if that is preferred.
{ WHENFIRSTSTAT(Created), "The time when an item was created" }, | ||
{ WHENFIRSTSTAT(Compiled), "The time a workunit started being compiled" }, | ||
{ WHENFIRSTSTAT(WorkunitModified), UNUSED }, | ||
{ TIMESTAT(Elapsed), "The elapsed time between starting and finishing\nFor child queries this may be significantly larger than TimeTotalExecute" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I was aiming for no fullstops since these are likely to appear in tooltips (so I would remove the ones above). @GordonSmith what would you prefer?
const char * queryStatisticDescription(StatisticKind kind) | ||
{ | ||
StatisticKind rawkind = (StatisticKind)(kind & StKindMask); | ||
if (rawkind >= StKindNone && rawkind < StMax) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is consistent with the other functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments inline
{ NUMSTAT(IndexWildSeeks), "The number of seeks caused by WILD() filters\nThe number of keyed lookups that had to search for the next potential match. If this is a high proportion of NumIndexScans it may suggest poor key design" }, | ||
{ NUMSTAT(IndexSkips), "The number of smart-stepping operations that increment the next match" }, | ||
{ NUMSTAT(IndexNullSkips), "The number of smart-stepping operations that had no effect\nIf this is large compare to NumIndexSkips it suggests the priority may not be set correctly" }, | ||
{ NUMSTAT(IndexMerges), "The number of merges set up when smart stepping"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consistency: other instances of smart-stepping are hyphenated
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(IndexRowsRead), "The number of rows read from the index" }, | ||
{ NUMSTAT(DiskAccepted), "The number of disk rows that return a result from the TRANSFORM" }, | ||
{ NUMSTAT(DiskRejected), "The number of disk rows that are skipped by the TRANSFORM" }, | ||
{ TIMESTAT(Soapcall), "The time taken to executing a SOAPCALL" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
executing s/b execute
system/jlib/jstats.cpp
Outdated
{ NUMSTAT(ScansPerRow), UNUSED }, | ||
{ NUMSTAT(Allocations), "The number of allocations from the row memory" }, | ||
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" }, | ||
{ NUMSTAT(DiskRetries), "The number of times an I/O operation was retried\nIf this is non zero it may suggest a problem with the underlying disk storage" }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-zero s/b hyphenated
Signed-off-by: Gavin Halliday <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good from my POV. (I accept smart stepping without a hyphen when not used as an adjective. )
Type of change:
Checklist:
Smoketest:
Testing: