-
Notifications
You must be signed in to change notification settings - Fork 185
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor the UUID generator. #4082
Conversation
This pull request has been linked to Shortcut Story #28607: Use a common UUID generation code across platforms.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to simplify the UUID generation by just formatting random bytes. There's no need to move randomness through a UUID struct for legacy reasons that don't apply to us.
If we want to be pedantic and call these UUIDv4, we can set the 6 bits or whatever though technically this makes things more likely to collide (by an imperceptibly minuscule amount) so no one does to my knowledge. Not setting those bits we can just call them UUIDs without a specific version.
namespace tiledb { | ||
namespace sm { | ||
namespace uuid { | ||
struct Uuid { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole UUID struct seems fairly overboard for generating a UUID. I'd simplify it to just two functions for generating bytes and formatting them as hex. Since we've put this in the uuid
namespace, calling code would just need to use auto uuid = uuid::generate();
or similar. Might be you'll want to move the rand_bytes function to some other place but I don't think we have any custom rand modules yet so unless you know of one I wouldn't bother creating it.
struct Uuid { | |
#ifdef _WIN32 | |
void rand_bytes(std::string& bytes, size_t num_bytes) { | |
bytes.resize(num_bytes); | |
int rc = BCryptGenRandom( | |
BCRYPT_RNG_ALG_HANDLE, | |
reinterpret_cast<uint8_t*>(bytes.data()), | |
num_bytes, | |
0); | |
if (!NT_SUCCESS(rc)) { | |
throw std::runtime_error("Cannot generate GUID: BCryptGenRandom failed."); | |
} | |
} | |
#else | |
void rand_bytes(std::string& bytes, size_t num_bytes) { | |
bytes.resize(num_bytes); | |
int rc = RAND_bytes(reinterpret_cast<uint8_t*>(bytes.data()), num_bytes); | |
if (rc < 1) { | |
char err_msg[256]; | |
ERR_error_string_n(ERR_get_error(), err_msg, sizeof(err_msg)); | |
throw std::runtime_error( | |
"Cannot generate random bytes with OpenSSL: " + std::string(err_msg)); | |
} | |
} | |
#endif | |
std::string generate() { | |
std::string bytes(16); | |
rand_bytes(bytes, 16); | |
std::string uuid(32); | |
for(size_t i = 0; i < bytes.size(); i++) { | |
snprintf(uuid.data() + (i * 2), 2, "%02x", (uint8_t) bytes[i]); | |
} | |
return uuid; | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I removed the hyphenation parameter completely. We don't use it anywhere except tests so why bother keeping it around? If we need it in the future its not like it wouldn't be trivial to add. Also, I've never seen hyphenated versions outside of very legacy uses as those hyphens are dead weight in terms of storage/transmission costs and since random UUIDs aren't parsed there's not reason to pretend there's structure to the value.
@@ -38,11 +38,9 @@ namespace tiledb::storage_format { | |||
|
|||
std::string generate_uri( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is purely a comment that you can feel free to ignore, but given that we're already planning on rerouting all the places that generate URIs through this function it might be beneficial to do that as part of this PR since we're already modifying the places that would be calling into it. From my spot checking you'll need to add a version that doesn't append the version on but that's pretty trivial.
I didn't look hard enough, but the other thing is a version that takes a std::pair<uint64_t, uint64_t>
and proxies through to the existing one might be helpful but might be not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've been asked before to unify these calls, will do as part of this PR.
Closing this. It will be done in https://app.shortcut.com/tiledb-inc/story/27545/implement-single-process-fragment-isolation-strategy by changing the fragment format to avoid the use of UUIDs. |
SC-28607
This PR:
uuid::generate_uuid
, making it return a string and simplifying its callsites.RAND_Bytes
is thread-safe since OpenSSL 1.1.TODO
fmt
instead ofsnprintf
.Crypto::get_random_bytes
after Improve cryptography tests. #3979.TYPE: IMPROVEMENT
DESC: Refactor the UUID generator and use a single implementation across platforms.