Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(clustering): ensure data plane config hash is never nil #11917

Merged
merged 2 commits into from
Nov 7, 2023

Conversation

flrgh
Copy link
Contributor

@flrgh flrgh commented Nov 4, 2023

The Bug

The previous logic defaulted the config_hash to nil when it was detected to be an empty string. This can cause update_sync_status() to fail, because config_hash is a required attribute:

> 2023/11/03 17:13:30 [debug] 4052224#0: *150 [lua] connector.lua:560: execute(): SQL query throw error: ERROR: null value in column "config_hash" of relation "clustering_data_planes" violates not-null constraint
> Failing row contains (4fb29006-8db1-48bb-b68c-34b582e1d91a, soup, 127.0.0.1, 2023-11-04 00:13:30+00, null, 2023-11-18 00:13:30.799+00, 3.6.0, filter_set_incompatible, 2023-11-04 00:13:30+00, {})., close connection
> 2023/11/03 17:13:30 [notice] 4052224#0: *150 [lua] init.lua:275: upsert(): ERROR: null value in column "config_hash" of relation "clustering_data_planes" violates not-null constraint

Why was config_hash an empty string? Because we blindly update this variable from the data received from ping frames from the data plane

The Fix

This change addresses the problem from two angles:

  1. when empty or invalid, config_hash is set to the default DECLARATIVE_EMPTY_CONFIG_HASH constant instead of nil
  2. an additional guard was added to the dp reader thread, which checks the length of ping frame data and returns an error if it is not a proper config hash

Checklist

  • The Pull Request has tests
  • A changelog file has been created under changelog/unreleased/kong or skip-changelog label added on PR if changelog is unnecessary.
  • There is a user-facing docs PR N/A

Issue reference

KAG-2967

@flrgh
Copy link
Contributor Author

flrgh commented Nov 4, 2023

Will add a changelog when I'm back at my desk next week, but the code is complete and ready for review in the mean time. done

This makes the test less complicated and easier to debug on failure.
@flrgh flrgh force-pushed the fix/clustering-empty-config-hash branch from e53f614 to 017629c Compare November 6, 2023 17:39
@flrgh flrgh marked this pull request as ready for review November 6, 2023 17:39
The previous logic defaulted the config_hash to nil when it was
detected to be an empty string. This can cause update_sync_status() to
fail, because config_hash is a required attribute:

> 2023/11/03 17:13:30 [debug] 4052224#0: *150 [lua] connector.lua:560: execute(): SQL query throw error: ERROR: null value in column "config_hash" of relation "clustering_data_planes" violates not-null constraint
> Failing row contains (4fb29006-8db1-48bb-b68c-34b582e1d91a, soup, 127.0.0.1, 2023-11-04 00:13:30+00, null, 2023-11-18 00:13:30.799+00, 3.6.0, filter_set_incompatible, 2023-11-04 00:13:30+00, {})., close connection
> 2023/11/03 17:13:30 [notice] 4052224#0: *150 [lua] init.lua:275: upsert(): ERROR: null value in column "config_hash" of relation "clustering_data_planes" violates not-null constraint

This change addresses the problem from two angles:

1. when empty, config_hash is set to the default DECLARATIVE_EMPTY_CONFIG_HASH
   constant instead of nil
2. an additional guard was added to the dp reader thread, which checks the
   length of ping frame data and returns an error if it is not a proper
   config hash
@flrgh flrgh force-pushed the fix/clustering-empty-config-hash branch from 017629c to 5b420ed Compare November 7, 2023 00:11
@hanshuebner hanshuebner merged commit 083ab25 into master Nov 7, 2023
25 checks passed
@hanshuebner hanshuebner deleted the fix/clustering-empty-config-hash branch November 7, 2023 13:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants