
[backend] create_custom_training_job_from_component adds an unexpected default location #11283

Open
VaclavMacha opened this issue Oct 9, 2024 · 0 comments

Environment

  • How did you deploy Kubeflow Pipelines (KFP)? VertexAI Pipeline

  • KFP version: I don't know which version is used in VertexAI

  • KFP SDK version: 2.7.0 (and google-cloud-pipeline-components 2.17.0)

Steps to reproduce

When creating a custom training job from a component with create_custom_training_job_from_component, it is not possible to set the default project and location in which the job should be executed.

from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component
from kfp import dsl


# Create a Python component
@dsl.component()
def dummy_component():
    return

# Convert the above component into a custom training job
dummy_job = create_custom_training_job_from_component(
    dummy_component,
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
    name="Dummy pipeline",
)
def dummy_pipeline():
    # this task will be executed in the default location which is us-central1 
    task_1 = dummy_job().set_display_name("Task 1")

    # this task will be executed in europe-west1
    task_2 = dummy_job(
        location="europe-west1",
    ).set_display_name("Task 2")

The example above shows that if the location is not provided when a task is created from dummy_job, the default location (us-central1) is used. This behavior is unintuitive, because the location input argument is injected into dummy_component inside the create_custom_training_job_from_component function, so it is not apparent from the code that a location argument exists at all. Moreover, running all tasks in the same location is standard practice, so it makes sense to set the location once when converting the component into a custom training job rather than individually for each task in the pipeline.

The same behavior applies to the project. In that case, however, the default value is PROJECT_ID_PLACEHOLDER, which is resolved to the correct project at run time.
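Until such arguments exist, one possible workaround (a sketch, not verified against the real SDK) is to pre-bind the hidden location input with functools.partial so it only has to be stated once. The stand-in factory below is hypothetical and merely mimics the keyword interface the real converted component exposes:

```python
import functools

# Stand-in for the callable returned by create_custom_training_job_from_component
# (hypothetical, for illustration only): like the real one, it accepts the
# injected `location` keyword with the hard-coded us-central1 default.
def dummy_job(location="us-central1", **kwargs):
    return {"location": location, **kwargs}

# Bind the location once; every task created from dummy_job_eu now defaults
# to europe-west1 without repeating the argument at each call site.
dummy_job_eu = functools.partial(dummy_job, location="europe-west1")

print(dummy_job_eu()["location"])                     # europe-west1
print(dummy_job_eu(location="us-east1")["location"])  # an explicit value still wins
```

An explicit location passed at task-creation time still overrides the pre-bound default, since keyword arguments supplied at call time take precedence over those stored in the partial.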

Expected result

I would expect the create_custom_training_job_from_component function to accept location and project input arguments so that the default values can be overridden. I would also expect the default location not to be a hard-coded value (us-central1) but LOCATION_PLACEHOLDER.

from google_cloud_pipeline_components.v1.custom_job import create_custom_training_job_from_component
from kfp import dsl


# Create a Python component
@dsl.component()
def dummy_component():
    return

# Convert the above component into a custom training job
dummy_job = create_custom_training_job_from_component(
    dummy_component,
    project="dummy_project",
    location="europe-west1",
)

# Define a pipeline that runs the custom training job
@dsl.pipeline(
    name="Dummy pipeline",
)
def dummy_pipeline():
    # Both tasks will be executed in europe-west1
    task_1 = dummy_job().set_display_name("Task 1")

    task_2 = dummy_job(
        location="europe-west1",
    ).set_display_name("Task 2")

Impacted by this bug? Give it a 👍.
