[RPC] Fetch TPCH profiled info for application at runtime #92

Merged: 9 commits merged into main from dg/tpch_spark_workload on Mar 2, 2024

Conversation

dhruvsgarg (Collaborator)

No description provided.

@sukritkalra (Contributor) left a comment

Initial pass.

rpc/service.py Outdated
@@ -19,6 +19,7 @@

 from schedulers import EDFScheduler, FIFOScheduler
 from utils import EventTime, setup_logging
+from utils_tpch import get_all_stage_info_for_query

Instead of making this a top-level utils_tpch.py, can you create a folder named tpch inside rpc and put this code in a utils.py inside it?

Please also include the *.npy files from Decima that we are using for initial testing.
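
A minimal sketch of the layout this suggestion points at, assuming the helper keeps its current function name (the exact package structure is up to the author):

rpc/
  service.py
  tpch/
    __init__.py
    utils.py        # formerly utils_tpch.py
    *.npy           # Decima profiling data used for initial testing

# service.py could then import the helper along these lines (the exact
# import path depends on how the rpc package is invoked):
from rpc.tpch.utils import get_all_stage_info_for_query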

rpc/service.py Outdated
Comment on lines 388 to 393
default_resource = Resources(resource_vector={
        Resource(name="Slot_CPU", _id="any"): 30
    }
)
default_runtime = EventTime(20, EventTime.Unit.US)

Are you using this in the code below?

rpc/service.py Outdated
for task_dependency in request.dependencies:
    framework_task = task_dependency.key
    if is_tpch_query:
        task_slots = tpch_query_all_stage_info[framework_task.id]["num_tasks"]

Can you explain how you're matching the correct stage with its actual resource usage and runtime here?
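
For reference, a hedged sketch of the kind of per-stage lookup this question is about. The "num_tasks" key appears in the diff above; the shape returned by get_all_stage_info_for_query, the "avg_task_duration" key, and the query_num variable are assumptions for illustration:

# Illustrative only: attach profiled stage info to each task, assuming
# framework_task.id lines up with the stage index used when profiling,
# and that query_num, request, and is_tpch_query come from the surrounding
# service code.
tpch_query_all_stage_info = get_all_stage_info_for_query(query_num)

for task_dependency in request.dependencies:
    framework_task = task_dependency.key
    if is_tpch_query:
        stage_info = tpch_query_all_stage_info[framework_task.id]
        task_slots = stage_info["num_tasks"]
        task_runtime = stage_info["avg_task_duration"]  # assumed key name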

utils_tpch.py Outdated
Comment on lines 3 to 4
import numpy as np
import os

The build is failing; please look at the errors and resolve them.

@sukritkalra (Contributor) left a comment

LGTM since tested.

Minor comments

default_resource = Resources(
    resource_vector={Resource(name="Slot_CPU", _id="any"): 30}
)
default_runtime = EventTime(20, EventTime.Unit.US)

Suggested change
-default_runtime = EventTime(20, EventTime.Unit.US)
+default_runtime = EventTime(20, EventTime.Unit.S)

Do you have any runtimes that are lower than a second?
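
For clarity, a tiny illustration of the unit change being suggested, assuming EventTime takes a magnitude and a unit enum as in the snippet above:

default_runtime_us = EventTime(20, EventTime.Unit.US)  # 20 microseconds (current)
default_runtime_s = EventTime(20, EventTime.Unit.S)    # 20 seconds (suggested)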

def get_all_stage_info_for_query(query_num):
    task_durations = np.load(
        os.path.join(
            HOME_TPCH_DIR, TPCH_SUBDIR, "task_duration_" + str(query_num) + ".npy"

Should we get the TPCH_SUBDIR label from the application name?

We might want to run a mix of TPC-H queries from different scaling sizes.
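
A hedged sketch of what deriving the subdirectory from the application could look like. HOME_TPCH_DIR and the file naming convention come from the excerpt above; the tpch_subdir parameter and its default value are purely illustrative:

import os

import numpy as np

# HOME_TPCH_DIR is assumed to be defined at module level, as in the original file.


def get_all_stage_info_for_query(query_num, tpch_subdir="2g"):  # "2g" is an illustrative default
    # tpch_subdir selects the profiled scaling size, so callers can pass a
    # label parsed from the application name instead of a fixed TPCH_SUBDIR.
    task_durations = np.load(
        os.path.join(
            HOME_TPCH_DIR, tpch_subdir, "task_duration_" + str(query_num) + ".npy"
        )
    )
    ...  # rest of the original function unchanged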

    return False, None


def verify_and_relable_tpch_app_graph(query_num, dependencies):

Suggested change
-def verify_and_relable_tpch_app_graph(query_num, dependencies):
+def verify_and_relabel_tpch_app_graph(query_num, dependencies):

Typo

@sukritkalra merged commit 388f39f into main on Mar 2, 2024
1 check passed
@sukritkalra deleted the dg/tpch_spark_workload branch on March 2, 2024 at 00:16