[Computation Hash] User Computation hash is agnostic to debug metadata #8538
Comments
Is there any reason someone would want the old behavior? Sounds like we should just fix
Only if someone somehow relies on the metadata for some functional property. I am not aware of any such functionality, especially with the attributes that are currently in it. There are several, though: https://github.com/openxla/xla/blob/a49d854a355374453af37a266823450db6714d9a/xla/xla_data.proto#L385. Some are quite ambiguous: https://github.com/openxla/xla/blob/a49d854a355374453af37a266823450db6714d9a/xla/xla_data.proto#L445
The earlier we make such a decision, the better, but the flag would allow us to revert this behavior for any customers that do rely on these fields. We can then push for XLA to add proper entries outside of the metadata if they are needed for any functionality. It's a bit unclear there. We can also remove only a subset of the attrs, which some backend compile engines do, e.g.
Let me double check with some XLA folks.
I think the question I wanted answered is: do any of these fields affect the semantics of the resulting compiled executable in any way?
🚀 Feature
Similar to #8537:
and in this case, different IR debug entries end up generating different hashes for otherwise identical computations. The changed hash then forces a recompilation because of a Torch JIT cache miss.
In this feature request, we consider a design decision: make the cache agnostic to metadata that should not influence the execution of the computations.
Motivation
We want to avoid recompiling when only the metadata in the proto differs. Such recompilation hurts performance and can lead to OOMs, since some backend engines treat each resulting HLO from TorchXLA as a unique executable binary.
Pitch
We could consider introducing a new flag that changes this behavior, as either opt-in or opt-out, say XLA_CACHE_IGNORE_METADATA, that removes some entries in the metadata field of the HLO proto module, e.g.:
op_name
source_line
source_file
before generating the hash key for caching, while retaining them everywhere else. See all entries in https://github.com/openxla/xla/blob/a49d854a355374453af37a266823450db6714d9a/xla/xla_data.proto#L385. This would apply anywhere the protobuf used for caching (including the user computation) is generated. It should be extensible in case other entries fit the same profile. At the very least, TorchXLA's "debug" metadata entries should, by design, not interfere with the resulting computation, unlike explicit metadata such as frontend_attributes, rhs_contracting_dims, etc., e.g.:
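Separately from the attributes listed above, here is a minimal sketch of the pitch itself: drop the debug-only metadata entries from a copy of the module, then hash what remains. It is not TorchXLA code. The module is modeled as a plain dict instead of the real HLO proto, and `strip_debug_metadata`, `cache_key`, and `DEBUG_METADATA_FIELDS` are hypothetical names. XLA_CACHE_IGNORE_METADATA is the flag name proposed in this issue, and reading it as an environment variable set to "1" is an assumption.

```python
import copy
import hashlib
import json
import os

# Hypothetical, extensible set of debug-only metadata entries to ignore.
DEBUG_METADATA_FIELDS = frozenset({"op_name", "source_file", "source_line"})


def strip_debug_metadata(module):
    """Return a copy of `module` without debug-only metadata entries.

    `module` is a plain dict standing in for an HLO module proto; the
    original object is left untouched so the metadata is retained
    everywhere outside of hashing.
    """
    stripped = copy.deepcopy(module)
    for instruction in stripped.get("instructions", []):
        metadata = instruction.get("metadata")
        if metadata is None:
            continue
        for field in DEBUG_METADATA_FIELDS:
            metadata.pop(field, None)
        # Treat an emptied metadata entry the same as a missing one.
        if not metadata:
            del instruction["metadata"]
    return stripped


def cache_key(module, ignore_metadata=None):
    """Compute a cache key, optionally ignoring debug metadata.

    When `ignore_metadata` is None, fall back to the proposed
    XLA_CACHE_IGNORE_METADATA flag (assumed here to be an env var).
    """
    if ignore_metadata is None:
        ignore_metadata = os.environ.get("XLA_CACHE_IGNORE_METADATA", "0") == "1"
    if ignore_metadata:
        module = strip_debug_metadata(module)
    # sort_keys gives a deterministic serialization for hashing.
    payload = json.dumps(module, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


if __name__ == "__main__":
    base = {"name": "m", "instructions": [{"opcode": "add", "metadata": {
        "op_name": "aten__add", "source_file": "a.py", "source_line": 10}}]}
    moved = copy.deepcopy(base)
    moved["instructions"][0]["metadata"]["source_line"] = 42
    # Same computation, different source line: keys match only when ignoring.
    print(cache_key(base, True) == cache_key(moved, True))
    print(cache_key(base, False) == cache_key(moved, False))
```

Fields outside the ignored set, such as frontend_attributes, still feed into the key, so only the debug entries stop causing cache misses.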