HPCC-30588 Publish descriptions for each of the statistic types #18299

Merged
merged 3 commits into from
Feb 22, 2024

Conversation

Member

@ghalliday ghalliday commented Feb 15, 2024

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

@ghalliday
Member Author

Test by using the url:
http://:8010/WsWorkunits/WuDetailsMeta.json

@ghalliday
Member Author

@JamesDeFabia please check the language
@GordonSmith please check this provides what you need. I'll add some details to the jira.
@jakesmith please check everything!

{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" },
{ NUMSTAT(RowsProcessed), "How many rows have been processed" },
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" },
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" },
Member

Minor: missing period prior to new line?

Member

An activity is active if this does not match NumStop

Not sure it is worth expanding on, but a mismatch might indicate an abnormal termination state rather than an active one.
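
To make the distinction concrete, here is a minimal sketch of how a consumer of the statistics might interpret the two counters. It is not taken from the PR: `ActivityState`, `classifyActivity` and the `jobFinished` flag are hypothetical names, and the rule that "fewer stops than starts after the job has ended" signals an abnormal termination is an assumption based only on the remark above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification; the real engines may track this differently.
enum class ActivityState { NotStarted, Active, Stopped, PossiblyAbnormal };

// numStarts / numStops stand in for the NumStarts and NumStops statistics.
// jobFinished is an assumed extra input: whether the owning job has ended.
ActivityState classifyActivity(std::uint64_t numStarts, std::uint64_t numStops, bool jobFinished)
{
    if (numStarts == 0)
        return ActivityState::NotStarted;
    if (numStops >= numStarts)
        return ActivityState::Stopped;
    // Fewer stops than starts: still running, unless the job is already over.
    return jobFinished ? ActivityState::PossiblyAbnormal : ActivityState::Active;
}

// Usage sketch: the same counters read differently once the job has ended.
void usageExample()
{
    assert(classifyActivity(2, 1, false) == ActivityState::Active);
    assert(classifyActivity(2, 1, true) == ActivityState::PossiblyAbnormal);
}
```

The counters alone cannot separate the two cases, which is why the sketch needs the extra flag.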

{ PERSTAT(Replicated), "The percentage replication complete" },
{ NUMSTAT(DiskRowsRead), "The number of rows read from the file" },
{ NUMSTAT(IndexRowsRead), "The number of rows read from the index" },
{ NUMSTAT(DiskAccepted), "The number of disk rows that resturn a result from the transform" },
Member

Typo: resturn

{ TIMESTAT(SpillElapsed), "Time spent spilling rows from memory to disk" }, //MORE: Do we have a similar stat for SpillRead?
{ TIMESTAT(SortElapsed), "Time spent sorting rows in memory" },
{ NUMSTAT(Groups), "The number of groups processed by this activity" },
{ NUMSTAT(GroupMax), "The size of the largest group processed by this activity.\nA skew in group size can cause a skew in processing time. A large skew may indicate some special values would benefit from special casing." },
Member

Minor: period at end of line is inconsistent.

{ NUMSTAT(ScansPerRow), UNUSED },
{ NUMSTAT(Allocations), "The number of allocations from the row memory" },
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" },
{ NUMSTAT(DiskRetries), "The number of times an io operation was retried.\nIf this is non zero it may suggest a problem with the underlying disk storage" },
Member

io -> IO (Similar to OS)?

{ NUMSTAT(SysContextSwitches), "The number of context switches that occurred when processing" },
{ TIMESTAT(OsUser), "Total elapsed user-space time" },
{ TIMESTAT(OsSystem), "Total time spent in the system/kernel" },
{ TIMESTAT(OsTotal), "Total elapsed time according to the OS.\nIncludes system,user,idle and iowait times" },
Member

Minor: spaces after commas?

{ WHENFIRSTSTAT(Dequeued), "The time when this item was removed from a queue" },
{ WHENFIRSTSTAT(K8sLaunched), "The time when the job to procss this item was launched" },
{ WHENFIRSTSTAT(K8sStarted), "The time when the job to procss this item started executing/nThe difference between the K8sStarted and K8sLaunched indicates how long Kubernetes took to resource and initialised the job." },
{ WHENFIRSTSTAT(K8sReady), "The time when the Thor job is ready to process\nThe difference with K8sStarted inidcates how long it took to resource and start the slave processes" },
Member

Missing period before newline.

{ SIZESTAT(RowMemory), "The size of memory used to store rows" },
{ SIZESTAT(PeakRowMemory), "The peak memory used to store rows" },
{ SIZESTAT(AgentSend), "The size of data sent to the agent from the server" },
{ TIMESTAT(IndexCacheBlocked), "The time spend waiting to access the index page cache" },
Member

spend -> spent?

{ CYCLESTAT(IndexCacheBlocked) },
{ TIMESTAT(AgentProcess) },
{ TIMESTAT(AgentProcess), "The total time spend by the agents processing requests" },
Member

spend -> spent?

{ SIZESTAT(ContinuationData) },
{ NUMSTAT(ContinuationRequests) },
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" },
{ SIZESTAT(ContinuationData), "The total size of continuation data sent from agent to the server\nA large number may indicate a poor filter, or merging from many different index locations" },
Member

Missing period before newline.

{ NUMSTAT(ContinuationRequests) },
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" },
{ SIZESTAT(ContinuationData), "The total size of continuation data sent from agent to the server\nA large number may indicate a poor filter, or merging from many different index locations" },
{ NUMSTAT(ContinuationRequests), "The number of time the agent indicated there was more data to be returned" },
Member

time -> times

Contributor

@JamesDeFabia JamesDeFabia left a comment

A few comments inline. NB: I only reviewed the language in jstats.cpp.

{ SIZESTAT(GeneratedCpp), "The size of the generated c++ file" },
{ SIZESTAT(PeakMemory), "The peak memory used while processing this item" },
{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" },
{ NUMSTAT(RowsProcessed), "How many rows have been processed" },
Contributor

For consistency: The number of rows processed

{ SIZESTAT(MaxRowSize), "The high water mark of the memory used for representing rows (roxiemem)" },
{ NUMSTAT(RowsProcessed), "How many rows have been processed" },
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" },
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" },
Contributor

For consistency: The number of times the activity has started executing...

{ NUMSTAT(RowsProcessed), "How many rows have been processed" },
{ NUMSTAT(Slaves), "The number of parallel execution processes used to execute an activity" },
{ NUMSTAT(Starts), "How many times the activity has started executing\nAn activity is active if this does not match NumStops" },
{ NUMSTAT(Stops), "How many times the activity has stopped executing.\nAn activity is active if this is less than NumStarts" },
Contributor

For consistency: The number of times the activity has stopped executing.

{ NUMSTAT(PreloadCacheHits), UNUSED },
{ NUMSTAT(PreloadCacheAdds), UNUSED },
{ NUMSTAT(ServerCacheHits), UNUSED },
{ NUMSTAT(IndexAccepted), "The number of keyed join matches that return a result from the transform" },
Contributor

Keywords s/b All CAPS: KEYED JOIN TRANSFORM

{ NUMSTAT(PreloadCacheAdds), UNUSED },
{ NUMSTAT(ServerCacheHits), UNUSED },
{ NUMSTAT(IndexAccepted), "The number of keyed join matches that return a result from the transform" },
{ NUMSTAT(IndexRejected), "The number of keyed join matches that are skipped by the transform" },
Contributor

Keywords s/b All CAPS: KEYED JOIN TRANSFORM

{ SIZESTAT(SpillFile) },
{ NUMSTAT(DiskReads), "The number of disk read operations" },
{ NUMSTAT(DiskWrites), "The number of disk write operations" },
{ NUMSTAT(Spills), "How many times the activity spilt to disk"},
Contributor

For consistency: The number of times the activity spilt to disk

{ NUMSTAT(ScansPerRow), UNUSED },
{ NUMSTAT(Allocations), "The number of allocations from the row memory" },
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" },
{ NUMSTAT(DiskRetries), "The number of times an io operation was retried.\nIf this is non zero it may suggest a problem with the underlying disk storage" },
Contributor

io s/b I/O

{ COSTSTAT(Execute) },
{ SIZESTAT(AgentReply) },
{ TIMESTAT(AgentWait) },
{ COSTSTAT(Execute), "Cpu cost of executing" },
Contributor

Cpu s/b CPU

{ COSTSTAT(Compile) },
{ TIMESTAT(NodeLoad) },
{ COSTSTAT(FileAccess), "The transactional cost of any file operations" },
{ NUMSTAT(Pods), "How many pods were used" },
Contributor

For consistency: The number of pods used.

{ NUMSTAT(AckRetries) },
{ SIZESTAT(ContinuationData) },
{ NUMSTAT(ContinuationRequests) },
{ NUMSTAT(AckRetries), "How many times the server fails to receive a response from an agent within the expected time" },
Contributor

For consistency: The number of times...

Member

@jakesmith jakesmith left a comment

@ghalliday - looks good.

Mostly pedantic issues: the tense of the sentences is inconsistent in places. I think it would be better to always use the past tense.

{ WHENFIRSTSTAT(Created), "The time when an item was created" },
{ WHENFIRSTSTAT(Compiled), "The time a workunit started being compiled" },
{ WHENFIRSTSTAT(WorkunitModified), UNUSED },
{ TIMESTAT(Elapsed), "The elapsed time between starting and finishing\nFor child queries this may be significantly larger than TimeTotalExecute" },
Member

missing full stop before \n for consistency.

But for ultimate consistency, all single line descriptions should also have a terminating full stop.

Member Author

I think I was aiming for no full stops since these are likely to appear in tooltips (so I would remove the ones above). @GordonSmith what would you prefer?

Member

Primarily consistency, I don't really mind either way.
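
Whichever convention is chosen, a small check can keep the table consistent. The following is only a sketch, assuming the "no full stops" option discussed above: it flags a description that ends with a period or that has a period immediately before the `\n` separating the summary line from the detail text. `PeriodIssue` and `checkNoPeriods` are hypothetical names, not part of jstats.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical result type for the style check.
enum class PeriodIssue { None, TrailingPeriod, PeriodBeforeNewline };

// Assumed convention: no period at the very end of a description and none
// directly before a newline. Periods inside the detail text are allowed.
PeriodIssue checkNoPeriods(const std::string & text)
{
    if (!text.empty() && text.back() == '.')
        return PeriodIssue::TrailingPeriod;
    if (text.find(".\n") != std::string::npos)
        return PeriodIssue::PeriodBeforeNewline;
    return PeriodIssue::None;
}

// Usage sketch with two descriptions quoted in this review.
void usageExample()
{
    assert(checkNoPeriods("The time when an item was created") == PeriodIssue::None);
    assert(checkNoPeriods("The number of index scans.\nHow many entries are sequentially examined")
           == PeriodIssue::PeriodBeforeNewline);
}
```

Inverting the two conditions would give the opposite check if the "always terminate with a full stop" convention were preferred instead.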

{ WHENFIRSTSTAT(Compiled), "The time a workunit started being compiled" },
{ WHENFIRSTSTAT(WorkunitModified), UNUSED },
{ TIMESTAT(Elapsed), "The elapsed time between starting and finishing\nFor child queries this may be significantly larger than TimeTotalExecute" },
{ TIMESTAT(LocalExecute), "The time spent executing this activity not including its inputs\nSort activities by local execute time to help isolate potential processing bottlenecks" },
Member

missing full stop before \n

{ NUMSTAT(IndexScans), "The number of index scans.\nHow many entries are sequentially examined after an initial seek (including wild seeks). Large numbers compared to the number of seeks may indicate extra keyed filters would be worthwhile" },
{ NUMSTAT(IndexWildSeeks), "The number of seeks caused by WILD() filters.\nThe number of keyed lookups that had to search for the next potential match. If this is a high proportion of NumIndexScans it may suggest poor key design" },
{ NUMSTAT(IndexSkips), "The number of smart-stepping operations that increment the next match" },
{ NUMSTAT(IndexNullSkips), "The number of smart-stepping operations that have no effect.\nIf this is large compare to NumIndexSkips it suggests the priority may not be set correctly" },
Member

that have

that had?

{ NUMSTAT(PostFiltered), "The number of index matches filtered by the transform and the non-keyed filter" },
{ NUMSTAT(BlobCacheHits), "The number of times a blob is resolved in the cache" },
{ NUMSTAT(LeafCacheHits), "The number of times a leaf node is resolved in the cache" },
{ NUMSTAT(NodeCacheHits), "The number of times a branch node is resolved in the cache" },
Member

tense: "is resolved" -> "was resolved"?

NB: below it is past tense : "was read"

{ NUMSTAT(AttribsSimplified), UNUSED },
{ NUMSTAT(AttribsFromCache), UNUSED },
{ NUMSTAT(SmartJoinDegradedToLocal), "The number of times a global smart-join switched to a LOCAL JOIN (with distribute)" },
{ NUMSTAT(SmartJoinSlavesDegradedToStd), "The number of times a global smart-join degraded to a standard join" },
Member

Is it worth saying, after a newline, that the above 2 stats will be either 0 or 1 unless in a LOOP?

{ TIMESTAT(AgentWait) },
{ COSTSTAT(Execute), "Cpu cost of executing" },
{ SIZESTAT(AgentReply), "Size of data sent from the workers to the agent" },
{ TIMESTAT(AgentWait), "Time that the agent spends waiting for a reply from the workers" },
Member

spends->spent ?

{ CYCLESTAT(NodeLoad) },
{ TIMESTAT(LeafLoad) },
{ TIMESTAT(LeafLoad), "Time spent reading leaf nodes from disk and decompressing them\nIf this is a high proportion of the time (especially compared to TimeLeadRead) then consider using the new index compression formats" },
Member

TimeLeadRead -> TimeLeafRead

const char * queryStatisticDescription(StatisticKind kind)
{
StatisticKind rawkind = (StatisticKind)(kind & StKindMask);
if (rawkind >= StKindNone && rawkind < StMax)
Member

The result will be the same either way, I think, but should this be rawkind > StKindNone?

Member Author

It is consistent with the other functions.
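
For readers following the `>=` versus `>` question, here is a self-contained sketch of this kind of masked table lookup. The enum values, `StMinX`, the table contents and the name `lookupDescription` are all assumptions made for illustration; only the shape of the range check comes from the snippet under review. Under the assumption that the entry for `StKindNone` is null, both comparisons return the same result.

```cpp
#include <cassert>

// Hypothetical stand-ins for the real jstats definitions; the names mirror the
// snippet under review but the values and table contents are assumptions.
enum StatisticKind : int
{
    StKindNone = 0,
    StTimeElapsed,
    StNumRowsProcessed,
    StMax,
    StKindMask = 0x0ffff,   // assumed: the low bits hold the raw kind
    StMinX = 0x10000        // assumed: a variant flag stored above the mask
};

// One description per raw kind, indexed by the enum value.
// Entry 0 (StKindNone) is deliberately null.
static const char * const descriptions[StMax] =
{
    nullptr,
    "The elapsed time between starting and finishing",
    "The number of rows processed",
};

const char * lookupDescription(StatisticKind kind)
{
    StatisticKind rawkind = (StatisticKind)(kind & StKindMask);
    // With ">=" StKindNone is in range and yields its null table entry;
    // with ">" it would skip the table and reach the same nullptr below.
    if (rawkind >= StKindNone && rawkind < StMax)
        return descriptions[rawkind];
    return nullptr;
}

// Usage sketch: a flagged variant resolves to the same entry as its raw kind.
void usageExample()
{
    StatisticKind variant = (StatisticKind)(StTimeElapsed | StMinX);
    assert(lookupDescription(variant) == lookupDescription(StTimeElapsed));
    assert(lookupDescription(StKindNone) == nullptr);
}
```

The two forms would only differ if the table held a non-null entry at index 0.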

Signed-off-by: Gavin Halliday <[email protected]>
Member Author

@ghalliday ghalliday left a comment

Made changes (and a few others). I have removed all periods, but I can add them back if that is preferred.

Contributor

@JamesDeFabia JamesDeFabia left a comment

A few comments inline

{ NUMSTAT(IndexWildSeeks), "The number of seeks caused by WILD() filters\nThe number of keyed lookups that had to search for the next potential match. If this is a high proportion of NumIndexScans it may suggest poor key design" },
{ NUMSTAT(IndexSkips), "The number of smart-stepping operations that increment the next match" },
{ NUMSTAT(IndexNullSkips), "The number of smart-stepping operations that had no effect\nIf this is large compare to NumIndexSkips it suggests the priority may not be set correctly" },
{ NUMSTAT(IndexMerges), "The number of merges set up when smart stepping"},
Contributor

Consistency: other instances of smart-stepping are hyphenated

system/jlib/jstats.cpp
{ NUMSTAT(IndexRowsRead), "The number of rows read from the index" },
{ NUMSTAT(DiskAccepted), "The number of disk rows that return a result from the TRANSFORM" },
{ NUMSTAT(DiskRejected), "The number of disk rows that are skipped by the TRANSFORM" },
{ TIMESTAT(Soapcall), "The time taken to executing a SOAPCALL" },
Contributor

executing s/b execute

{ NUMSTAT(ScansPerRow), UNUSED },
{ NUMSTAT(Allocations), "The number of allocations from the row memory" },
{ NUMSTAT(AllocationScans), "The number of scans within the memory manager when allocating row memory\nOnly applies to the scanning heap manager (not used by default)" },
{ NUMSTAT(DiskRetries), "The number of times an I/O operation was retried\nIf this is non zero it may suggest a problem with the underlying disk storage" },
Contributor

non-zero s/b hyphenated

Signed-off-by: Gavin Halliday <[email protected]>
@ghalliday ghalliday requested review from JamesDeFabia and GordonSmith and removed request for GordonSmith February 21, 2024 10:45
Contributor

@JamesDeFabia JamesDeFabia left a comment

Good from my POV. (I accept smart stepping without a hyphen when not used as an adjective.)

@ghalliday ghalliday merged commit b713bbd into hpcc-systems:candidate-9.4.x Feb 22, 2024
48 checks passed
4 participants