Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort Qual execs report by sqlId and nodeId #1436

Merged
merged 2 commits into from
Nov 27, 2024

Conversation

amahussein
Copy link
Collaborator

Fixes #1435

This pull request is to make the execs report sorted by SqlID and nodeID.
Overall, it is more readable and easier to troubleshoot the file by looking at the rows grouped by the SQLID next to each other.
Also, this implementation is more optimized because it avoids creating a list of tuples for each row. Instead, it create a string sequence from the execInfo and convert it to string.

This pull request includes several changes to the QualOutputWriter class and related methods to simplify and improve the code for writing execution reports. The most important changes include removing redundant methods, simplifying method signatures, and refactoring the logic for constructing CSV rows.

Simplification and refactoring:

Method signature updates:

Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
@amahussein amahussein added the core_tools Scope the core module (scala) label Nov 26, 2024
@amahussein amahussein self-assigned this Nov 26, 2024
Signed-off-by: Ahmed Hussein (amahussein) <[email protected]>
Copy link
Collaborator

@nartal1 nartal1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @amahussein !

Copy link
Collaborator

@parthosa parthosa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @amahussein. A minor question.

StringUtils.reformatCSVString(info.stages.mkString(":")),
childrenExecsStr,
nodeIdsStr,
if (info.shouldRemove) booleanTrue else booleanFalse,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Why do we require the booleanTrue and booleanFalse variables instead of directly using info.shouldRemove.toString?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question @parthosa !
I assume that info.shouldRemove.toString is going to create a new thread each time it is called. We can actually test that by checking the address of the string returned from the call in case Scala is optimized to return a string from an internal pool.
Using the booleanTrue allocates the object only once and shares it with all other records.

Copy link
Collaborator Author

@amahussein amahussein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @parthosa and @nartal1

StringUtils.reformatCSVString(info.stages.mkString(":")),
childrenExecsStr,
nodeIdsStr,
if (info.shouldRemove) booleanTrue else booleanFalse,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question @parthosa !
I assume that info.shouldRemove.toString is going to create a new thread each time it is called. We can actually test that by checking the address of the string returned from the call in case Scala is optimized to return a string from an internal pool.
Using the booleanTrue allocates the object only once and shares it with all other records.

@amahussein amahussein merged commit 5f10bbd into NVIDIA:dev Nov 27, 2024
14 checks passed
@amahussein amahussein deleted the rapids-tools-1435 branch November 27, 2024 15:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core_tools Scope the core module (scala)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[FEA] Make Execs report in Qualification more readable by sorting
3 participants