From 7de82019df5815727bad6507d1188b41e9e82582 Mon Sep 17 00:00:00 2001
From: Michael Primeaux
Date: Tue, 29 Oct 2024 14:30:42 -0500
Subject: [PATCH] Added Nano ID collision calculator

---
 CHANGELOG/CHANGELOG-1.x.md            |   2 +-
 docs/nanoid-collision-calculator.html | 359 ++++++++++++++++++++++++++
 2 files changed, 360 insertions(+), 1 deletion(-)
 create mode 100644 docs/nanoid-collision-calculator.html

diff --git a/CHANGELOG/CHANGELOG-1.x.md b/CHANGELOG/CHANGELOG-1.x.md
index 28fe23e..059b06f 100644
--- a/CHANGELOG/CHANGELOG-1.x.md
+++ b/CHANGELOG/CHANGELOG-1.x.md
@@ -10,7 +10,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Added
 ### Changed
-- **DEBT:** Check for duplicate characters using a bitmask with multiple `uint32`s. A `uint32` array can represent `256` bits (`32` bits per `uint32 × 8 = 256`). This allows us to track each possible byte value without the limitations of a single uint64
 ### Deprecated
 ### Removed
 ### Fixed
@@ -20,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [1.6.0] - 2024-OCT-29
 ### Added
+- **FEATURE:** Added [Nano ID collision calculator](../docs/nanoid-collision-calculator.html).
 ### Changed
 - **DEBT:** Check for duplicate characters using a bitmask with multiple `uint32`s. A `uint32` array can represent `256` bits (`32` bits per `uint32 × 8 = 256`). This allows us to track each possible byte value without the limitations of a single uint64
 ### Deprecated
diff --git a/docs/nanoid-collision-calculator.html b/docs/nanoid-collision-calculator.html
new file mode 100644
index 0000000..fe6778b
--- /dev/null
+++ b/docs/nanoid-collision-calculator.html
@@ -0,0 +1,359 @@
+ NanoID Collision Time Calculator
+

NanoID Collision Time Calculator

+ Mathematical Explanation: +
+ To estimate the time required to reach a 1% probability of at least one collision when generating NanoIDs, we first compute the number of IDs that must be generated, using the following approximation derived from the birthday paradox:

+ Formula: +
+ n = √(-2 × N × ln(1 - P)) +

+ Where: +
+ • n = Total number of IDs needed to reach the target probability.
+ • N = a^k = Total number of possible unique IDs, where a is the alphabet size and k is the ID length.
+ • P = Target collision probability (in this case, 0.01 for 1%).
+ • ln = Natural logarithm.
+
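The formula and symbol definitions above can be sketched in a few lines of JavaScript. This is a minimal illustration, not the calculator's actual script: the function name `idsForCollisionProbability` and its parameter names are hypothetical, and N is held as a double-precision float, so the result is approximate for large ID spaces.

```javascript
// Hypothetical helper: n = sqrt(-2 * N * ln(1 - P)), with N = a^k.
function idsForCollisionProbability(alphabetSize, idLength, probability) {
  if (!(probability > 0 && probability < 1)) {
    throw new RangeError("probability must be strictly between 0 and 1");
  }
  // N = a^k, stored as a double (approximate for very large values).
  const totalIds = Math.pow(alphabetSize, idLength);
  // Math.log1p(-p) computes ln(1 - p) while keeping precision for small p.
  return Math.sqrt(-2 * totalIds * Math.log1p(-probability));
}

// Usage: IDs needed for a 1% collision probability with a = 64, k = 21.
const idsNeeded = idsForCollisionProbability(64, 21, 0.01);
console.log(idsNeeded.toExponential(3));
```

Using `Math.log1p` rather than `Math.log(1 - p)` is a design choice that matters only for very small target probabilities, where `1 - p` would otherwise round to 1.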
+ Because the formula is already solved for n, dividing n by the rate of ID generation gives the time required to reach the target probability.
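For readers who want to see where the formula comes from, here is a sketch of the standard birthday-problem derivation. It assumes IDs are drawn uniformly and independently from the N possible values and that n is much smaller than N; the symbols match the definitions above.

```latex
\begin{align*}
\Pr[\text{no collision among } n \text{ IDs}]
  &= \prod_{i=1}^{n-1}\left(1 - \frac{i}{N}\right)
   \approx \exp\!\left(-\sum_{i=1}^{n-1}\frac{i}{N}\right)
   = \exp\!\left(-\frac{n(n-1)}{2N}\right)
   \approx \exp\!\left(-\frac{n^{2}}{2N}\right) \\
P &\approx 1 - \exp\!\left(-\frac{n^{2}}{2N}\right) \\
\ln(1 - P) &\approx -\frac{n^{2}}{2N} \\
n &\approx \sqrt{-2\,N\,\ln(1 - P)}
\end{align*}
```

The first approximation uses 1 - x ≈ e^{-x} for small x, and the second replaces n(n-1) with n^2.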

+ Example Calculation: +
+ If you have an alphabet size of 64 characters and an ID length of 21, the total number of possible unique IDs N is:
+ N = 64^21 = 2^126 ≈ 8.507 × 10^37
+ To reach a 1% collision probability:
+ n = √(-2 × 8.507 × 10^37 × ln(0.99)) ≈ 1.308 × 10^18
+ If you generate 1,000 IDs per hour, the time t required is:
+ t = n / rate = 1.308 × 10^18 / 1,000 = 1.308 × 10^15 hours ≈ 1.49 × 10^11 years (about 149 billion years)
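The worked example can be reproduced with a short script. This is a sketch under stated assumptions rather than the calculator's own code: the names `requiredIds`, `yearsToCollisionProbability`, and `HOURS_PER_YEAR` are hypothetical, and a year is assumed to be 365.25 days.

```javascript
// Assumption: one year = 365.25 days (average Julian year).
const HOURS_PER_YEAR = 24 * 365.25;

// Hypothetical helper: n = sqrt(-2 * a^k * ln(1 - P)).
function requiredIds(alphabetSize, idLength, probability) {
  const totalIds = Math.pow(alphabetSize, idLength); // N = a^k as a double
  return Math.sqrt(-2 * totalIds * Math.log1p(-probability));
}

// Hypothetical helper: t = n / rate, converted from hours to years.
function yearsToCollisionProbability(alphabetSize, idLength, probability, idsPerHour) {
  const hours = requiredIds(alphabetSize, idLength, probability) / idsPerHour;
  return hours / HOURS_PER_YEAR;
}

// Usage: 64-character alphabet, 21-character IDs, 1% target, 1,000 IDs per hour.
const years = yearsToCollisionProbability(64, 21, 0.01, 1000);
console.log(`${years.toExponential(2)} years`);
```

Because the time scales linearly with 1 / rate, multiplying the generation rate by 1,000 divides the result by 1,000.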

+ Note: This calculation assumes that each ID is generated independently and uniformly at random, so that every one of the N possible IDs is equally likely on each draw. The formula is an approximation that is accurate when n is much smaller than N.
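To get a feel for how close the approximation is under these assumptions, it can be compared with the exact birthday probability for a small ID space. The sketch below uses hypothetical function names and the classic case of 23 draws from 365 values; it is an illustration, not part of the calculator.

```javascript
// Exact probability of at least one collision among n uniform,
// independent draws from totalIds equally likely values.
function exactCollisionProbability(n, totalIds) {
  let pNoCollision = 1;
  for (let i = 1; i < n; i++) {
    pNoCollision *= 1 - i / totalIds;
  }
  return 1 - pNoCollision;
}

// Birthday approximation P ≈ 1 - exp(-n^2 / (2N)), which the
// formula n = sqrt(-2 * N * ln(1 - P)) inverts.
function approxCollisionProbability(n, totalIds) {
  return 1 - Math.exp(-(n * n) / (2 * totalIds));
}

const exact = exactCollisionProbability(23, 365);
const approx = approxCollisionProbability(23, 365);
console.log(exact.toFixed(4), approx.toFixed(4));
```

For this deliberately small example the two values are about 0.507 and 0.516. The gap shrinks as n becomes small relative to N, which is the regime NanoID-sized spaces operate in.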