Uses of `PreHashedMap` are probably not sound #6373

jimblandy · 2024-10-06T16:43:36Z

The PreHashedMap type in wgpu_core is probably not used in a sound way, in particular, its use in Device::derive_pipeline_layout is almost certainly incorrect.

The idea is that a PreHashedMap key is just a copy of the true key's u64 hash value. But this doesn't provide any way to distinguish keys that have the same hash value: PartialEq on a PreHashedKey just compares the cached hash values. This will lead to incorrect lookups and insertions.

(I don't think PreHashedMap makes much sense: it's only sound if you can guarantee a distinct hash value for distinct keys, which means these "hash values" are almost certainly not produced by hashing: you're using some identifier or pointer that was assigned in a way that guarantees uniqueness. In that case, you should simply use that identifier as the key directly.)

The text was updated successfully, but these errors were encountered:

jimblandy · 2024-10-06T16:45:41Z

Aside: Managing pools of de-duplicated complex values is really tricky, even if you're not playing games with hash values: my friend recently found a year-old race condition in similar code he'd written for wasmtime that led to a CVE.

jimblandy · 2024-10-06T16:53:19Z

From chat, @cwfitzgerald says:

We can probably mitigate this by storing the key and the hash, then dispatching eq to the real key, and hash to the precomputed hash

I infer from this that the motivation was to avoid the overhead of re-computing hash values. If we're doing a lot of queries with the same key, and we can keep the PreHashedKey around to reuse for all the queries, then this makes sense.

But it's my understanding that Rust hash tables themselves cache the hash values of the keys already stored in the table. A lookup computes the hash of the key you're querying with, but then it just checks that against the hashes stored in the table, and performs a real PartialEq comparison only when the hashes match. So storing PreHashedKey in the table doesn't make sense; the table already has the hash value cached.

jimblandy · 2024-10-06T17:03:26Z

Hmm, I'm reluctant to dive into the hashbrown code, but from the description it seems like they only store one byte of the hash value for each key. But I believe it is the case that they are not re-hashing stored keys at lookup time, which means that storing PreHashedKey in the table isn't avoiding hashes.

cwfitzgerald · 2024-10-06T20:40:51Z

Yeah I dove into it a bit too, and yeah I think we should just pull the whole abstraction - I was trying to avoid needing to do large structure comparisons inside a lock, but I don't think that can be reasonably avoided.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uses of `PreHashedMap` are probably not sound #6373

Uses of `PreHashedMap` are probably not sound #6373

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

cwfitzgerald commented Oct 6, 2024

Uses of PreHashedMap are probably not sound #6373

Uses of PreHashedMap are probably not sound #6373

Comments

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

jimblandy commented Oct 6, 2024

cwfitzgerald commented Oct 6, 2024

Uses of `PreHashedMap` are probably not sound #6373

Uses of `PreHashedMap` are probably not sound #6373