Query out and use local size set in program IL in CL adapter. #2160

Merged (1 commit) on Oct 24, 2024

Commits on Oct 22, 2024

  1. Query out and use local size set in program IL in CL adapter.

    The CL spec wording on this is somewhat ambiguous, but every CL driver I
    tested (Intel, AMD and NVIDIA, across CPU and GPU) returns an error when
    a local size is set in the program source/IL and no local size is
    specified in the clEnqueueNDRangeKernel call (i.e. you leave it as
    NULL).
    
    Our spec does allow you to leave the local size as null if a size is
    specified in your program, so this change adds logic to query the size
    set in the program and pass it to the enqueue call.
    
    Initially I was concerned this might impact the performance of current
    users, but it looks like SYCL always passes a local size when calling
    urEnqueueKernelLaunch, so it won't hit the path with the extra query.
    aarongreig committed Oct 22, 2024
    a97df49
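
The fallback described in the commit message can be sketched roughly as follows. This is not the adapter's actual code. The names `LocalSize`, `CompileSizeQuery` and `chooseLocalSize` are hypothetical, and the query is passed in as a callback so the sketch compiles without the OpenCL headers. In a real adapter the callback would presumably wrap `clGetKernelWorkGroupInfo` with `CL_KERNEL_COMPILE_WORK_GROUP_SIZE`, which reports the size declared via `reqd_work_group_size` or (0, 0, 0) when none is declared.

```cpp
#include <array>
#include <cstddef>
#include <functional>

// Hypothetical alias: a 3-dimensional local (work-group) size.
using LocalSize = std::array<std::size_t, 3>;

// Hypothetical stand-in for the driver query. Assumed to return the size
// declared in the program source/IL, or {0, 0, 0} if none was declared.
using CompileSizeQuery = std::function<LocalSize()>;

// Returns the pointer that would be handed to the enqueue call as the
// local work size.
//  - If the caller supplied a local size, it is used as-is and the
//    query is never run (the path SYCL is said to take).
//  - Otherwise the program-declared size is queried and, if present,
//    written to `storage` and returned.
//  - If neither exists, nullptr is returned so the driver picks a size.
const std::size_t *chooseLocalSize(const std::size_t *userLocal,
                                   const CompileSizeQuery &query,
                                   LocalSize &storage) {
  if (userLocal != nullptr) {
    return userLocal; // no extra query on this path
  }
  storage = query();
  const bool declared =
      storage[0] != 0 || storage[1] != 0 || storage[2] != 0;
  return declared ? storage.data() : nullptr;
}
```

The caller owns `storage`, so the returned pointer stays valid for the duration of the enqueue call. The extra query only costs anything when the caller passes a null local size.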