The new trusted signals KVv2 protocol is a good opportunity for us to reconsider how we handle errors in the responses we receive over the wire.
There are a number of classes of errors, which we may or may not want to handle differently. Note that while I'm trying to group errors into sets that I think should probably be handled similarly, other groupings are possible. (For concreteness, a sketch of the response structure follows the list.)
1) Global errors (network errors, encryption errors, can't parse the outer CBOR, the entire structure of the response is wrong so we can't find any compression groups, framing header errors). In these cases, we have absolutely no usable data at all.
2) Errors with specific compression groups in the outer CBOR (two compression groups with a duplicate ID, groups without an ID, a group that's not even a map, a group missing other necessary fields ("content" is the only one currently), "content" not a binary string).
3) Errors that render an entire compression group's "content" unusable (decompression fails, can't parse it as CBOR, can't find the partitions array).
4) Errors in the partitions array (a partition that's not a map, a partition without an ID, partitions with duplicate IDs, a partition without a "keyGroupOutputs" array).
5) Errors within a partition's keyGroupOutputs CBOR (an entry that's not a map, "tags"/"keyValues" entry missing, a keyValues entry missing a "value" field, a "value" field that's not a string, a keyValues entry that's not a map, a "tags" value that's not an array, a "tags" array with multiple values).
6) A "value" entry in "keyGroupOutputs" that isn't valid JSON.
7) Errors within the "value" JSON (e.g., priorityVector not a dictionary, a value within priorityVector of the wrong type, "updateIfOlderThanMs" not a number).
8) Missing compression groups / partitions (I think receiving ones that weren't requested can just be ignored).
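For reference, here's a rough sketch of the decompressed response shape, limited to the fields named above, with comments marking where each error class applies. This is my reading of the structure, not a normative schema; field names not quoted above (compressionGroups, compressionGroupId, id) are guesses at the spec's spelling, and the shape isn't exhaustive:

```ts
// Rough sketch of the KVv2 response structure, limited to the fields
// discussed above. Numbered comments refer to the error classes.
// Class 1) covers everything that fails before this outer map parses.
type ResponseBody = {
  compressionGroups: CompressionGroup[];
};

type CompressionGroup = {            // 2) malformed / duplicate groups
  compressionGroupId: number;
  content: Uint8Array;               // compressed CBOR; 3) if unusable
};

// "content", once decompressed, holds an array of partitions:
type Partition = {                   // 4) malformed / duplicate partitions
  id: number;
  keyGroupOutputs: KeyGroupOutput[]; // 5) structural errors inside these
};

type KeyGroupOutput = {
  tags: string[];                    // must contain exactly one tag
  keyValues: { [key: string]: { value: string } }; // 6) "value" not JSON
};

// The "value" JSON string itself is where class 7) lives, e.g.:
//   { "priorityVector": { "signal": 1.5 }, "updateIfOlderThanMs": 3600000 }
// Class 8) is a requested compression group / partition that's absent.
```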
For 1)-5), the bad data had to come from the TEE code that wraps the bidder/seller response generation code, as opposed to the bidder/seller code itself, so we can probably just do whatever failure behavior is convenient (fail all fetches, fail only affected fetches, etc.), and treat it the same as a network failure.
On the other hand, 6) and 7) deal with data provided by the buyer/seller script, rather than by the shared top-level TEE code (not sure if there's better terminology here). Note that each value corresponds to a single key or to data about a single interest group. Discussions around what to do here are the reason for filing this issue. If we hard-fail on bad data, folks are more likely to notice bugs and fix their code, but it also means potentially breaking auctions whenever any bad data is received (B&A aborts bidding/scoring on KV fetch failure; Chrome currently does not. Which of those is the right behavior is yet another question worth discussing, albeit on a separate issue). Back to the issue at hand: hard-failing the entire response and ignoring errors in the JSON while keeping whatever we can extract from it each have their own advantages.
The current behavior in Chrome's not-yet-landed code is to treat 6) as a fatal error for the entire partition (or the entire response), and to ignore 7), extracting whatever data we can successfully read from the JSON string. That seems inconsistent; 7)'s permissive behavior is inherited from shared KVv1 code. If we hard fail, there's also the question of what exactly we fail: just the requests that try to use the bad data, all requests, all requests using the partition, or all requests using the compression group. Only failing the requests that use the bad value opens up options for lazily parsing values, but it also means any cached parsed JSON would need to record when a particular chunk of JSON failed to parse.
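To make that last point concrete, here's a minimal sketch of such a cache, with hypothetical names and in TypeScript for brevity (the actual Chrome code is C++). The key point is that a parse failure is memoized exactly like a success, so every request that touches the bad value fails the same way:

```ts
// Hypothetical sketch: a per-response cache of lazily parsed "value" strings.
// A failed parse is cached just like a success, so later requests that hit
// the same key see the same error instead of re-parsing.
type ParseResult =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

class LazyValueCache {
  // Raw "value" strings, keyed by the requested key.
  private raw: Map<string, string>;
  // Memoized parse outcomes, including failures.
  private parsed = new Map<string, ParseResult>();

  constructor(raw: Map<string, string>) {
    this.raw = raw;
  }

  get(key: string): ParseResult {
    const cached = this.parsed.get(key);
    if (cached !== undefined) return cached;

    const rawValue = this.raw.get(key);
    let result: ParseResult;
    if (rawValue === undefined) {
      result = { ok: false, error: `no value for key "${key}"` };
    } else {
      try {
        result = { ok: true, value: JSON.parse(rawValue) };
      } catch (e) {
        result = { ok: false, error: `invalid JSON: ${e}` };
      }
    }
    this.parsed.set(key, result); // Cache failures too.
    return result;
  }
}
```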
Failures of type 8) could be grouped with 1)-5), though it's probably best to just individually fail any requests with missing data. This is probably more an issue to discuss with the server team, depending on how their code works (and whether the choice even matters).
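To summarize the decision space, here's one hypothetical way to frame it (FailureScope and scopeFor are invented names, and the scopes assigned to 6) and 7) below are placeholders for the open question, not a proposal):

```ts
// Hypothetical framing of the decision space discussed above.
enum FailureScope {
  EntireResponse,       // treat like a network failure
  CompressionGroup,
  Partition,
  AffectedRequestsOnly, // only requests that touch the bad data
  IgnoreAndKeepRest,    // KVv1-inherited permissive behavior for 7)
}

function scopeFor(errorClass: number): FailureScope {
  switch (errorClass) {
    case 1: case 2: case 3: case 4: case 5:
      // Bad data from the shared TEE wrapper code: whatever failure
      // behavior is convenient, treated like a network failure.
      return FailureScope.EntireResponse;
    case 8:
      // Missing compression group / partition: fail just the
      // requests that needed the missing data.
      return FailureScope.AffectedRequestsOnly;
    case 6: case 7:
      // Buyer/seller-provided data: the open question in this issue.
      return FailureScope.AffectedRequestsOnly; // placeholder
    default:
      throw new Error(`unknown error class ${errorClass}`);
  }
}
```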