Use C code from pgsql-hackers #15
This code has several advantages:
1. It implements the counter per the standard draft. The counter protects against collisions better than random numbers when UUIDs are generated at high speed.
2. It buffers randomness, which greatly improves the speed of generation. Try select uuid_generate_v7() from generate_series(1,1E6);
3. It avoids pghton(), because it does several unnecessary manipulations.
4. It avoids initializing the timestamp part with randomness. This helps to save some entropy. Make the world greener :)
The code is based on PG patch https://commitfest.postgresql.org/43/4388/ I have my repo https://github.com/x4m/pg_uuid_next , but I do not have the energy to support it like you do. So I decided to bring some code from there to your implementation.
Have a nice day :)
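To illustrate point 2 (buffered randomness), here is a minimal sketch in C. It is not the code from the PR: the names rand_cache, rand_cache_init, rand_cache_get and urandom_source and the 1024-byte buffer size are hypothetical, and reading /dev/urandom only stands in for whatever source the extension really uses (inside PostgreSQL that role would normally be played by pg_strong_random()). The idea is to fetch random bytes in bulk and hand them out in small slices, so the expensive call happens once per many UUIDs.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define RAND_BUF_SIZE 1024  /* assumed size, chosen only for illustration */

/* Hypothetical entropy-source signature: returns 1 on success, 0 on failure. */
typedef int (*random_source_fn)(void *buf, size_t len);

typedef struct {
    uint8_t          bytes[RAND_BUF_SIZE];
    size_t           used;     /* bytes of the buffer already handed out */
    unsigned         refills;  /* how many times the source was called */
    random_source_fn source;
} rand_cache;

/* Stand-in source for this sketch: read len bytes from /dev/urandom. */
int urandom_source(void *buf, size_t len)
{
    FILE  *f = fopen("/dev/urandom", "rb");
    size_t got;

    if (f == NULL)
        return 0;
    got = fread(buf, 1, len, f);
    fclose(f);
    return got == len;
}

void rand_cache_init(rand_cache *c, random_source_fn source)
{
    c->used = RAND_BUF_SIZE;  /* mark the buffer empty to force a refill */
    c->refills = 0;
    c->source = source;
}

/* Copy len random bytes into out, refilling the buffer only when needed. */
int rand_cache_get(rand_cache *c, void *out, size_t len)
{
    if (len > RAND_BUF_SIZE)
        return 0;
    if (RAND_BUF_SIZE - c->used < len) {
        if (!c->source(c->bytes, RAND_BUF_SIZE))
            return 0;
        c->used = 0;
        c->refills++;
    }
    memcpy(out, c->bytes + c->used, len);
    c->used += len;
    return 1;
}
```

With this layout, requesting 10 bytes per UUID touches the entropy source roughly once per 100 UUIDs instead of once per UUID. Any leftover bytes at the end of the buffer are discarded on refill, which keeps the logic simple at a small cost in entropy.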
@x4m , thanks for your PR! I also read your thread in the A few thoughts:
All in all, I'm unlikely to add 1 or 3. If I benchmark the cache in 2 and it provides a large performance improvement, I may implement something similar. 4 seems reasonable if I can verify and test that there aren't any off-by-one errors that would lead to buffer overflows. Cheers.
I'm still undecided on implementing something similar to 2. After several benchmarks, the performance gain is about 10% on a Postgres install with OpenSSL. This is pretty good, but I still feel like the functionality probably shouldn't live in this extension. I'm going to continue considering it, however.
@fboulnois
At this point, the only substantive difference between x4m's patch and my extension is the alternate pseudorandom construction, which I've already discussed in #15 (comment). Cheers!
I think @sergeyprokhorenko meant exactly this: let's add a counter or microseconds.
I meant to add a counter 18 bits long, initialized every millisecond with a random number. The most significant bit of the counter is initialized to zero to prevent the counter from overflowing if the most significant bits of the random number are ones.
This promotes sortability, additional monotonicity, and better database performance. It can also make it easier to find bugs, because the latest version of a record is known.
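The counter scheme described above can be sketched in C as follows. This is only an illustration of the idea, not code from either repository: v7_counter_state and v7_next_counter are hypothetical names, the caller is assumed to supply the current time in milliseconds and fresh random bits, and the sketch does not handle a clock that moves backwards or a counter that runs out of values within one millisecond.

```c
#include <stdint.h>

#define COUNTER_BITS      18
#define COUNTER_MASK      ((1u << COUNTER_BITS) - 1u)  /* 0x3FFFF */
#define COUNTER_SEED_MASK (COUNTER_MASK >> 1)          /* 0x1FFFF: top bit cleared */

/* Hypothetical per-generator state. */
typedef struct {
    uint64_t last_ms;  /* millisecond of the previous UUID */
    uint32_t counter;  /* 18-bit counter value used for that UUID */
} v7_counter_state;

/*
 * Return the counter value to embed in the next UUID.
 * On a new millisecond the counter is reseeded from random_bits with its
 * most significant bit forced to zero; otherwise it is incremented.
 */
uint32_t v7_next_counter(v7_counter_state *st, uint64_t now_ms,
                         uint32_t random_bits)
{
    if (now_ms > st->last_ms) {
        st->last_ms = now_ms;
        st->counter = random_bits & COUNTER_SEED_MASK;
    } else {
        /* Same millisecond: count up. The mask wraps silently here; a real
         * generator would need an explicit policy for that case. */
        st->counter = (st->counter + 1u) & COUNTER_MASK;
    }
    return st->counter;
}
```

Because the seed never exceeds 0x1FFFF, the first increment after even the largest possible seed yields 0x20000, which still fits in 18 bits.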
Different projects use different counter lengths. Why use 18? I see other projects use 14, 16, 24, 40, 42, and 48 bits for the counter size.
Just a sane number: 125MHz without overflow, a tradeoff between max frequency and predictability. A detailed explanation was in hackers AFAIR.
Too short a counter makes it possible for the counter to overflow under peak loads. This can be mitigated by using the timestamp as the counter in such a situation, but that is intuitively undesirable. On the other hand, a short counter theoretically allows for faster UUID detection if the UUID is compared over the timestamp + counter segment (this is not the case now) rather than over the entire UUID length. With 18 bits, counter overflow is guaranteed to be impossible if the high-order bit of the counter is initialized to zero every millisecond and the remaining counter bits are initialized to a random number.
Too long a counter reduces the random segment of the UUID, and therefore makes the UUID easier to guess by brute force. On the other hand, a long counter theoretically allows fewer resources to be spent on random number generation.
RFC 9562 recommends that the counter SHOULD be at least 12 bits but no longer than 42 bits.
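The headroom argument can be made concrete with a little arithmetic, sketched here in C. The function names are hypothetical, and the reasoning assumes only what is described above: the counter is reseeded every millisecond with its high-order bit cleared. In the worst case the seed is 2^(bits-1) - 1 and the largest value is 2^bits - 1, so 2^(bits-1) increments are always available within one millisecond.

```c
#include <stdint.h>

/*
 * Increments that are always available within one millisecond for a counter
 * of `bits` bits seeded with its most significant bit cleared.
 * Intended for the 12..42 bit range mentioned in RFC 9562.
 */
uint64_t guaranteed_increments_per_ms(unsigned bits)
{
    return (uint64_t)1 << (bits - 1);
}

/* The same guarantee expressed as a sustained rate per second. */
uint64_t guaranteed_rate_per_second(unsigned bits)
{
    return guaranteed_increments_per_ms(bits) * 1000u;
}
```

For 18 bits this gives 131072 increments per millisecond, or about 131 million per second, which is in the same range as the 125MHz figure quoted earlier in the thread. A 12-bit counter seeded the same way would guarantee only 2048 increments per millisecond.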
UUIDv7 generation function in the ClickHouse DBMS: https://clickhouse.com/docs/en/sql-reference/functions/uuid-functions#generateUUIDv7 For the sake of monotonicity, when several microservices access the UUIDv7 generator in parallel, a single UUIDv7 generator is used per server.
See https://www.postgresql.org/message-id/84D20D0F-B5FF-41CD-9F48-E282CE9FEC1D%40yandex-team.ru and https://www.postgresql.org/message-id/F91948DD-500A-4A22-ABB9-5F4C59C28851%40yandex-team.ru
I'm new to pgsql native extensions. Should sequence_counter and previous_timestamp be protected by a mutex? pg_uuidv7/sql/pg_uuidv7--1.6.sql Lines 7 to 8 in 65efc24
it's |