Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(tools/rand): speed up random string generation, take 2 #13186

Merged
merged 1 commit into from
Jun 12, 2024

Conversation

bungle
Copy link
Member

@bungle bungle commented Jun 7, 2024

Summary

This PR optimizes the kong.tools.rand.random_string.

Results can be seen in PR that this PR replaces:
#13150 (comment)

KAG-4704

@github-actions github-actions bot added the cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee label Jun 7, 2024
@bungle bungle force-pushed the refactor/speed_up_random_string_take_2 branch from 7eb5775 to 5425292 Compare June 7, 2024 10:15
@bungle bungle force-pushed the refactor/speed_up_random_string_take_2 branch 2 times, most recently from 7b1a440 to 6a51707 Compare June 7, 2024 10:43
@pull-request-size pull-request-size bot added size/S and removed size/M labels Jun 7, 2024
@chronolaw chronolaw self-requested a review June 11, 2024 08:34
kong/tools/rand.lua Outdated Show resolved Hide resolved
kong/tools/rand.lua Outdated Show resolved Hide resolved
kong/tools/rand.lua Show resolved Hide resolved
@fffonion
Copy link
Contributor

fffonion commented Jun 11, 2024

left my comment in the old PR for archive purpose, but also posting here #13150 (comment)

Edit (by @bungle) copying it here:

I'm not referring the openssl rand_bytes vs. /dev/urandom is securer or weaker, but it depends on our usage.
There are three points makes read from /dev/urandom + base64 not a good design:

We have to use math.rand to map extra character range +, /, thus the output is a mix of cryptographic secure source and non cryptographic secure source.

In our code, we have places we just need random strings, thus it doesn't have to be CSPRNG. We can just use math.random to generate fast and non cryptographic secure random strings.
Use of /dev/urandom is not FIPS compliant (unless the OS itself is FIPS compliant), as it uses non approved algorithm, if the output is used as secret.

So I'm just suggesting to have a pure lua math.rand flavor and a openssl rand bytes variant (no encode_base64, just use to_hex) to random_string. Something like random_string(non_csprng) and default to generate CSPRNG, and can fallback to fast math.random if need high performance (for example, generate trace IDs).

@bungle
Copy link
Member Author

bungle commented Jun 11, 2024

We have to use math.rand to map extra character range +, /, thus the output is a mix of cryptographic secure source and non cryptographic secure source.

Yes, the + and / are really rare. They usually occur none to 2 times in most of the cases. I think this was a compromise. An original implementation was like this:

return v4_uuid():gsub("-", "")

I think original idea was to make it more secure because there was issues reported because of above (the same issue is with the proposed math.rand as is it almost the same).

In our code, we have places we just need random strings, thus it doesn't have to be CSPRNG.

Right but original commit definitely wanted to make it harder for reason, this is the commit:
c01752a

So I'm just suggesting to have a pure lua math.rand flavor and a openssl rand bytes variant (no encode_base64, just use to_hex) to random_string.

We originally had almost the pure math.rand flavor, which the commit referenced above wanted to change for reason.

I think this is out of scope of this work. We can have separate PR for that, and also discuss should we have variable length ones in separate PR. The code here does not change things. It speeds it up. The to_hex effectively cuts the random string entropy to half. With wider char space like base62 you don't do it as much (e.g. here we almost retain the 192 bits of entropy, not fully in all the cases, but almost - with hex you would need longer output string).

I suggest we put better ones in PDK. The better ones could take in account:

  1. maximum entropy in shortest amount of text chars (baseXXX?)
  2. copy-pasteability (does double click select the whole thing?)
  3. readability (e.g. do not use chars that looks like each other)
  4. CSPRNG vs non-CSPRNG
  5. variable length outputs
  6. ascii / utf-8 safety, perhaps specific use cases like idna etc.
    etc.

In general there are many things to consider.

### Summary

This PR optimizes the `kong.tools.rand.random_string`.

Results can be seen in PR that this PR replaces:
#13150 (comment)

KAG-4673

Signed-off-by: Aapo Talvensaari <[email protected]>
@bungle bungle force-pushed the refactor/speed_up_random_string_take_2 branch from 6a51707 to 7d991d2 Compare June 11, 2024 09:26
@pull-request-size pull-request-size bot added size/M and removed size/S labels Jun 11, 2024
@bungle bungle merged commit 9a215eb into master Jun 12, 2024
33 checks passed
@bungle bungle deleted the refactor/speed_up_random_string_take_2 branch June 12, 2024 08:52
@team-gateway-bot
Copy link
Collaborator

Successfully created cherry-pick PR for master:

@fffonion
Copy link
Contributor

We originally had almost the pure math.rand flavor, which the commit referenced above wanted to change for reason.

I understand. But the commit reference doesn't actually make it any securer. The usage or base64 + rand mapping we
are trying to improve in this PR is not valid at first. So I was suggesting to just abandon the current design and invest
effort into properly design it, at least to solve the use cases in kong codebase. And it will make our FIPS mode more robust
too. I created a ticket to track the future work on this.

locao pushed a commit that referenced this pull request Jun 21, 2024
### Summary

This PR optimizes the `kong.tools.rand.random_string`.

Results can be seen in PR that this PR replaces:
#13150 (comment)

KAG-4673

Signed-off-by: Aapo Talvensaari <[email protected]>
(cherry picked from commit 9a215eb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee size/M skip-changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants