Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(dns): function to force no sync toip() #12739

Merged
merged 3 commits into from
Mar 22, 2024
Merged

Conversation

locao
Copy link
Contributor

@locao locao commented Mar 15, 2024

Summary

This change adds new individual_toip() public function to the DNS client.

This function bypasses the noSynchronisation = false option, forcing individual resolve of the hostnames. It avoids race-conditions when multiple calls to the access() phase handler happen in the context of the same request.

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary. README.md
  • There is a user-facing docs PR against https://github.com/Kong/docs.konghq.com - PUT DOCS PR HERE

Issue reference

KAG-3795

@github-actions github-actions bot added the cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee label Mar 15, 2024
@locao locao added the pr/wip A work in progress PR opened to receive feedback label Mar 15, 2024
@locao locao force-pushed the feat/force_dns_no_sync branch 3 times, most recently from 28ea225 to 27c5b5e Compare March 18, 2024 18:40
@locao locao marked this pull request as ready for review March 18, 2024 19:25
@chobits
Copy link
Contributor

chobits commented Mar 19, 2024

hi @locao as I’m trying to implement a new DNS client library based on mlcache (see PR #12305), and in that implmentation, the DNS query operation was implemented in the mlcache L3 callback. It deprecates the dns_no_sync option and multiple DNS queries for the same name will always be synchronized (even across workers) because it ultilizes mlcache concurrent control mechanism.

So I have one concern that if we update kong to the new DNS library, we might not support individual_toip().

If the wasm module must rely on a singleton DNS library, I need to reconsider how to handle it in the new DNS library. (I currently haven't come up with a good method to implement individual queries on top of mlcache, as this would completely disrupt mlcache's optimal usage pattern and make the code more complex. 😿 )

@chobits chobits force-pushed the feat/force_dns_no_sync branch from f4fa6f0 to 9cd917c Compare March 19, 2024 10:17
@locao locao removed the pr/wip A work in progress PR opened to receive feedback label Mar 19, 2024
@locao
Copy link
Contributor Author

locao commented Mar 19, 2024

Hi @chobits. If the new DNS client library behaves exactly the same way for the wasm module as for the lua land, there's nothing to worry about.

As the current DNS client is not being removed at the moment, I think we should move forward with this change. Maybe we can change the individual_toip() name to something that makes sense to exist in the new DNS client library, or add a compatibility function in the new library.

What do you think?

@locao locao force-pushed the feat/force_dns_no_sync branch from 9cd917c to 9dbe37f Compare March 19, 2024 20:14
@chobits
Copy link
Contributor

chobits commented Mar 20, 2024

Hi @chobits. If the new DNS client library behaves exactly the same way for the wasm module as for the lua land, there's nothing to worry about.

Yes, it behaves the same way for all the upper callers. I have take a thorough look at the KAG issue, and know that your concern is the race-condition of current DNS client library. But in new DNS client library we still have that race-condition caused by the internal use of resty-lock by mlcache. Could that work for wasm?

BTW, I know in that KAG issue, there is a semaphore lock race condition in the scenario of wasm using the DNS library. Could you explain why wasm could not tolerant the race condition if there is no dead-lock problem and semaphore lock seems not to cause CPU blocking. And I think it could help kong alleviate the pressure of queries on the DNS server.

As the current DNS client is not being removed at the moment, I think we should move forward with this change. Maybe we can change the individual_toip() name to something that makes sense to exist in the new DNS client library, or add a compatibility function in the new library.

It does not matter for the function name. I can provide a same function for wasm to be compatible with current DNS library.

What do you think?

@chobits chobits force-pushed the feat/force_dns_no_sync branch from 9dbe37f to 0e5d244 Compare March 20, 2024 05:55
@thibaultcha
Copy link
Member

The old DNS library is problematic for the ngx_wasm_module Lua bridge because of its use of the semaphore Lua library, which the no_sync option disables. Using the lock library that mlcache relies on in the new DNS resolver should not cause any problem to the WasmX Lua bridge, so we should be good once we do the switch in the future...

Copy link
Contributor

@chobits chobits left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I discussed with @locao offline, the WASM module could be responsible for compatibility with both new and old libraries, like

-- In the wasm module:
local toip = dns_client.toip

-- For original dns client library, it will use its `individual_toip` function.
if dns_client.individual_toip then
  topip =  dns_client.individual_toip
end

@locao locao force-pushed the feat/force_dns_no_sync branch from 0e5d244 to 5eeaea2 Compare March 22, 2024 13:27
@locao locao merged commit 48cd3a0 into master Mar 22, 2024
26 checks passed
@locao locao deleted the feat/force_dns_no_sync branch March 22, 2024 14:33
@team-gateway-bot
Copy link
Collaborator

Successfully created cherry-pick PR for master:

locao added a commit that referenced this pull request Apr 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick kong-ee schedule this PR for cherry-picking to kong/kong-ee size/M
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants