Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PaRSEC now allows DSLs to free the gpu task #307

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

devreal
Copy link
Contributor

@devreal devreal commented Nov 17, 2024

We can allocate the GPU task inside the task structure and avoid an extra allocation.

@devreal devreal requested a review from therault November 17, 2024 02:04
@devreal
Copy link
Contributor Author

devreal commented Nov 20, 2024

This doesn't work as thought. PaRSEC releases the task containing the gpu_task structure before the gpu_task is released so we end up with fields overwritten prematurely.

@devreal devreal marked this pull request as draft November 20, 2024 01:13
@devreal devreal marked this pull request as ready for review November 25, 2024 14:46
@devreal
Copy link
Contributor Author

devreal commented Nov 25, 2024

I take back what I said earlier. The error was somewhere else and not in this PR. Ready for review.

@@ -4,7 +4,7 @@
set(TTG_TRACKED_VG_CMAKE_KIT_TAG d1b34157c349cf0a7c2f149b7704a682d53f6486) # provides FindOrFetchLinalgPP and "real" FindOrFetchBoost
set(TTG_TRACKED_CATCH2_VERSION 3.5.0)
set(TTG_TRACKED_MADNESS_TAG 93a9a5cec2a8fa87fba3afe8056607e6062a9058)
set(TTG_TRACKED_PARSEC_TAG 58f8f3089ecad2e8ee50e80a9586e05ce8873b1c)
set(TTG_TRACKED_PARSEC_TAG a9ab33d8287578c68c0349662352f280bc83e2c0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the 4.0 please

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tc.out[i] = gpu_task->flow[i];
/* set up the device task */
parsec_gpu_task_t *gpu_task = task->dev_ptr->gpu_task;
/* TODO: needed? */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should not need this, you construct the list_item and then set the rest of the gpu_task fields to default values.

parsec_task_class_t& tc = task->dev_ptr->task_class;

// input flows are set up during register_device_memory as part of the first invocation above
for (int i = 0; i < MAX_PARAM_COUNT; ++i) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is the upper bound here always MAX_PARAM_COUNT ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't know how many device inputs the application will give us. We could put a stop there but the impact will be marginal.

We can allocate the GPU task inside the task structure and avoid
an extra allocation.

Signed-off-by: Joseph Schuchart <[email protected]>
@devreal devreal force-pushed the parsec-gpu-task-free branch from f6c8441 to 2c1323a Compare December 23, 2024 16:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants