Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rework hlsl-vector-type into two specs #361
base: main
Are you sure you want to change the base?
Rework hlsl-vector-type into two specs #361
Changes from 8 commits
c0239b5
343f6b3
7ffe4d4
edcb7e1
e7b1442
35dccde
3f8c22c
6a23347
e932fad
bbeff68
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if the range is inclusive, then that 4 should be 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does this design accommodate
N
being supplied from a specialization constant supplied at runtime, rather than a literal value that is known at HLSL compile time?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Runtime specialization constants are not currently supported in HLSL. That's a question we'll need to answer when it is. For now, any similar mechanism produces an error: https://godbolt.org/z/5Kszf9Tr3 This includes the existing vk:: namespace specialization/push constant support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somewhat related to what @tex3d asked, but more broad: the spec needs to say how N can be written.
N
where earlier there is a declarationconst uint N = 15;
N
whereN
is the parameter to a function, and the vector type declaration is inside the function.I guess it has to be statically computed. (I'm not sure what the HLSL term for it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this introduces no changes to the existing vector declaration restrictions, I didn't think it necessary to call out that it needs to be a constant expression. Looking at the docs though, I see that it isn't mentioned there, so I'll update the docs and call it out here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this line needed for the long vector spec? It's probably implicitly assumed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which part do you take to be implicitly assumed? I don't think support across all shader stages should be assumed. The control/wave uniformity requirements touch more on the followups and could probably be removed. I'm inclined to leave the implementations encouraging best practices language as some platforms might end up scalarizing vector operations at some point or another which won't lead to the best performance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to have a way to initialize all elements of the vector with one value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current way of doing that is assigning a scalar to a vector. This mentions scalars. Did you have something else in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if not all elements are listed explicitly. Are the rest zero-initialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The existing behavior where an error is produced for insufficient elements is unchanged, though poorly documented. I don't mind adding a note here to that effect.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we discuss alignment requirements at all here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any suggestions on how that discussion might look?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could say that a long vector <T,N> has the same alignment requirements as an array of N elements of type T.
Except that may be different from vector<float,4> which might have 16byte alignment. With scalar alignment they would be the same. (Sorry, I don't remember if modern HLSL assumes scalar alignment for vectors.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tex and I will discuss this and I'll update the spec accordingly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alignment for vectors in HLSL is purely based on the alignment of the scalar element type. Constant buffer layout ("legacy") is a special case where a vector that would cross a 16-byte aligned boundary is started at the next 16-byte aligned location instead.
Array elements have the same natural alignment based on scalar element type as vectors. However, they also have a special rule for constant buffer layout ("legacy") where each array element must start at the beginning of a 16-byte aligned boundary. There are also special rules for signature inputs/outputs for arrays, but these can't be described by a concept of alignment.
The "legacy" constant layout rules cannot be described by the concept of type alignments.
Changing the natural alignment rules for a vector depending on whether it has > 4 elements seems weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the space between
> >
necessary, as per C++03 style parsing? Might be good to call out.I hope no space is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HLSL 2021 templates are C++98/03-era, so the space is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm willing to add comments regarding vectors specifically even if there is no change in previous behavior, but as this is just the facts of HLSL templates (and pseudo-templates if you prefer), I feel this is outside the scope of this document.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While defining the HLSL language, I do not understand the need to link to the DXIL proposal as if it defines what's meant by "elementwise calculations" here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't consider this a language spec. I structured it as a shader model spec.
There's no "as if" about it. I defined what elementwise calculations are in the dxil vectors spec. I don't really care where that definition ends up, but it's useful to have because there isn't anywhere else. Since I wrote this spec to depend on that one, it made sense to put it there. As it stands, neither spec technically depends on the other because long vectors can be implemented with scalarization and native vectors can be used without changing the allowed vector length. The lack of any hard interdependence inclines me further toward leaving this as having a DXIL element as well as language elements even if it's just discussion of validation changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This section should be informative.
A good SPIR-V representation could be as arrays of elements. So it's still a single SSA value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SPIR-V representation is very much up in the air right now. I prefer to leave this vague. This is mentioned in the issues below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think DXIL validation should be left to the DXIL spec, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not find it useful to divide the specs along interchange format and language lines. The DXIL spec would have included elements of native vectors and also of long vectors while the language spec would have contained only part of the long vectors description. As such, I created two shader model features that can, but don't have to include language changes. This one includes language and DXIL changes. "The DXIL spec" only has DXIL changes, but its characteristic feature is that it documents native DXIL vectors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the interchange format section above we have:
This seems to imply that long vectors can be supported in existing shader models. I think it's only native DXIL vectors feature that actually requires SM 6.9?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That wasn't what I intended to imply with that statement, rather that implementations could choose either approach as we intend to do at least temporarily. However, it is true that this is possible. One of the major remaining questions is if we should.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I resolve this conversation: do we have something written down that is tracking this remaining question?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've created #371
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's where I think we describe testing for each supported backend IR. Shouldn't we be giving SPIR-V some love here as well?
Shouldn't there be a section before this that outlines various valid scenarios for AST testing?
For a given implementation, perhaps there would be an additional infra spec to outline tests for initial codegen and various important phases through the compiler as well, right? Perhaps the AST test scenarios belong there too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely. I haven't done that research just yet. I've added a slightly hand-wavey allusion to the SPIR-V equivalent.
Have we done this in the past? I'm not sure what scenarios you have in mind. I'd like to see an example to better understand.
I think infra specs are useful to discuss forthcoming implementations. Given the state of this implementation, I don't think writing one would be as productive as carefully documenting what has been done in code and commit comments. I think that would be more likely to be preserved and found by future generations of coders.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think DXIL validation belongs in the DXIL spec only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this section belongs in the DXIL spec exclusively, even though in practice most of our execution tests in DXC start with HLSL when testing DXIL backends.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think execution testing may be the strongest argument for keeping it here. Long and native vectors aren't really interdependent and although the tests would likely share a lot of code, they should be independently testable in execution testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this mean that long vectors can marked as groupshared and it just works? Wondering if this Q&A belongs in the DXIL spec?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
groupshared is allowed. I don't list it as being disallowed, but I don't mind calling it out explicitly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing the context or the significance of not including explicit Load/Store operations here, or just plain misunderstanding this Q/A altogether. The "preferred solution is outside the scope" seems to suggest that there's not going to be a way to use long vectors in groupshared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a feature that is directly connected with long nor native vectors. The core problem is that groupshared memory is limited and explicitly allocating it to one purpose is expensive. Instead, many users want to reuse groupshared memory for a few different purposes depending on stage and time.
There's some disagreement on how we can best solve this. We might provide what is essentially a groupshared rawbuffer with mechanisms to perform loads and stores on it similarly to how we do on global rawbuffers. That was the alternative approach previously described. There are more clever compiler things (tm) that could be done or other ways of enabling the user in this way.
Since it is not a problem directly connected with this, since there are a few different approaches we might take that would take a long time to decide on and implement, and since there are other solutions to the problem using existing mechanisms, we deemed it out of scope for this project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I think that this Q/A is a bit confusing as written now. If I understand correctly, it's saying that we're not going to tackle solving the problem of people wanting buffer-like access to groupshared memory. So in the same way that we don't support explicit Load/Store operations for other data types, we're not going to add new ones for this. Marking a long vector as groupshared is still expected to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you've said is accurate, so I don't know why you think it's confusing. I'm happy to take another run at it, but I think it's succinct and sufficient. It doesn't go into detail because I don't think we need to. It's a feature for another day. I could create an issue for it that could contain all the details if that would help.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was only able to reach the understanding by this conversation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's useful to go into detail about something that someone wanted to tack onto this spec, but isn't related to it. Perhaps the best approach is to just remove it.