Add Arguments for Distributed Mode in Qualification Tool CLI #1429

Open
wants to merge 9 commits into
base: spark-rapids-tools-distributed-base

@parthosa commented Nov 18, 2024

Fixes #1430.

This PR adds the initial CLI changes needed to support distributed execution in the Qualification Tool. It adds the arguments that enable distributed mode and sets the stage for future implementation PRs.

Note:

  • An environment setup document will be shared internally.

Changes Overview

  • Extended RapidsJob: Introduced two subclasses, RapidsDistributedJob and RapidsLocalJob, along with a concrete class for the OnPrem platform.
  • Created a JarCmdArgs class to encapsulate all arguments needed to construct the JAR command.
  • Implemented the DistributedToolsConfig class, allowing configurations for distributed tools (like Spark properties) to be specified via the existing --tools_config_file option.
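
For readers skimming the overview, the job split could look roughly like the sketch below. It is not the code in this PR: only the class names RapidsJob, RapidsLocalJob, RapidsDistributedJob and JarCmdArgs come from the description. Every field, the build_cmd method, and the make_job helper are hypothetical names chosen for illustration.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import List


@dataclass
class JarCmdArgs:
    """Hypothetical container for the pieces of a JAR command."""
    jvm_args: List[str] = field(default_factory=list)
    classpath: str = ''
    main_class: str = ''
    tool_args: List[str] = field(default_factory=list)


class RapidsJob(ABC):
    """Base job; each subclass decides how the tool is launched."""

    def __init__(self, cmd_args: JarCmdArgs):
        self.cmd_args = cmd_args

    @abstractmethod
    def build_cmd(self) -> List[str]:
        """Return the command to execute as a list of tokens."""


class RapidsLocalJob(RapidsJob):
    def build_cmd(self) -> List[str]:
        # Assumption: local mode runs the tools JAR with a plain `java` call.
        a = self.cmd_args
        return ['java', *a.jvm_args, '-cp', a.classpath, a.main_class, *a.tool_args]


class RapidsDistributedJob(RapidsJob):
    def build_cmd(self) -> List[str]:
        # The PR only sets the stage for distributed mode, so this sketch
        # leaves the launch logic as a placeholder.
        raise NotImplementedError('distributed launch is left to follow-up PRs')


def make_job(cmd_args: JarCmdArgs, distributed: bool) -> RapidsJob:
    """Hypothetical factory keyed on the --distributed flag."""
    return RapidsDistributedJob(cmd_args) if distributed else RapidsLocalJob(cmd_args)
```

Keeping the command pieces in one object means both job types can share the same inputs and differ only in how they turn them into a launch command.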

CMD:

spark_rapids qualification --platform onprem --eventlogs /path/to/eventlogs  --verbose --filter_apps all \
 --distributed --tools_config_file /path/to/custom_conf_file.yaml

Sample Config File:

api_version: '1.0'
distributed_tools:
  spark_properties:
    - name: 'spark.executor.memory'
      value: '20g'
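
As a rough illustration of how the distributed_tools section might be consumed, here is a minimal sketch using only the standard library. The class name DistributedToolsConfig comes from this PR, but the SparkProperty class, the from_dict and as_spark_conf helpers, and the field layout are assumptions, and the actual implementation may validate the file differently. The dictionary stands in for the sample file after YAML parsing.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SparkProperty:
    """One name/value pair from the spark_properties list."""
    name: str
    value: str


@dataclass
class DistributedToolsConfig:
    """Hypothetical model of the distributed_tools section."""
    spark_properties: List[SparkProperty] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> 'DistributedToolsConfig':
        # Treat a missing or empty list as "no properties".
        entries = raw.get('spark_properties') or []
        props = [SparkProperty(name=e['name'], value=str(e['value'])) for e in entries]
        return cls(spark_properties=props)

    def as_spark_conf(self) -> Dict[str, str]:
        """Flatten the list into a property-name -> value mapping."""
        return {p.name: p.value for p in self.spark_properties}


# The sample config file, as it would look after YAML parsing.
parsed = {
    'api_version': '1.0',
    'distributed_tools': {
        'spark_properties': [
            {'name': 'spark.executor.memory', 'value': '20g'},
        ],
    },
}

config = DistributedToolsConfig.from_dict(parsed['distributed_tools'])
print(config.as_spark_conf())  # {'spark.executor.memory': '20g'}
```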

Details:

Enhancements to argument processing:

Platform class updates:

  • user_tools/src/spark_rapids_pytools/cloud_api/databricks_aws.py, databricks_azure.py, dataproc.py, dataproc_gke.py, emr.py: Disabled pylint warnings for abstract methods.

Other improvements:

Signed-off-by: Partho Sarthi <[email protected]>
@parthosa added the labels "feature request" (New feature or request) and "user_tools" (Scope the wrapper module running CSP, QualX, and reports (python)) Nov 18, 2024
@parthosa self-assigned this Nov 18, 2024
@parthosa marked this pull request as ready for review November 18, 2024 21:47
@@ -608,7 +609,7 @@ def populate_dependency_list() -> List[RuntimeDependency]:
   # check if the dependencies is defined in a config file
   config_obj = self.get_tools_config_obj()
   if config_obj is not None:
-      if config_obj.runtime.dependencies:
+      if config_obj.runtime and config_obj.runtime.dependencies:

Since the runtime field in the tools config is now optional, we need to check that config_obj.runtime is not None before reading its dependencies.
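
A small self-contained sketch of why the extra check matters. The RuntimeConfig and ToolsConfig classes and the pick_dependencies helper are hypothetical stand-ins, not the project's real models. Only the guard expression mirrors the diff.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RuntimeConfig:
    dependencies: List[str] = field(default_factory=list)


@dataclass
class ToolsConfig:
    # Optional: a config file may now omit the runtime section entirely.
    runtime: Optional[RuntimeConfig] = None


def pick_dependencies(config_obj: Optional[ToolsConfig], defaults: List[str]) -> List[str]:
    """Prefer dependencies from the config file, else fall back to defaults."""
    if config_obj is not None:
        # Without the `config_obj.runtime and` part, a config that has no
        # runtime section would raise AttributeError on `None.dependencies`.
        if config_obj.runtime and config_obj.runtime.dependencies:
            return config_obj.runtime.dependencies
    return defaults
```

With this guard, a file that only defines distributed_tools falls through to the default dependency list instead of failing.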

Regenerated the specification file for --tools_config_file, since this PR introduces a new property, distributed_tools_config.

Sample config file that defines Spark properties for distributed mode.

@cindyyuanjiang left a comment

thanks @parthosa! LGTM, just a few quick questions.

Signed-off-by: Partho Sarthi <[email protected]>