-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fix TestAuthenticationAndHBARules stress tests SSL issue #132729
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pritesh-lahoti and @souravcrl)
pkg/sql/pgwire/auth_test.go
line 575 at r1 (raw file):
} if err != nil { if !strings.Contains(err.Error(), "SSL is not enabled on the server") {
let's use errors.Is(err, pq.ErrSSLNotSupported)
rather than a string match
https://github.com/lib/pq/blob/3d613208bca2e74f2a20e04126ed30bcb5c4cc27/conn.go#L34
pkg/sql/pgwire/auth_test.go
line 579 at r1 (raw file):
} // Retry: server might fail to respond to upgrade conn under stress dbSQL, err = gosql.Open("postgres", dsn)
hmm i'd like to understand a little bit more about how this might help before merging. let me dive into this a little bit first.
69b511c
to
e3ede16
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pritesh-lahoti and @rafiss)
pkg/sql/pgwire/auth_test.go
line 575 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
let's use
errors.Is(err, pq.ErrSSLNotSupported)
rather than a string matchhttps://github.com/lib/pq/blob/3d613208bca2e74f2a20e04126ed30bcb5c4cc27/conn.go#L34
done
pkg/sql/pgwire/auth_test.go
line 579 at r1 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
hmm i'd like to understand a little bit more about how this might help before merging. let me dive into this a little bit first.
have taken a note of this. will await your reply.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @pritesh-lahoti and @souravcrl)
pkg/sql/pgwire/auth_test.go
line 579 at r1 (raw file):
Previously, souravcrl wrote…
have taken a note of this. will await your reply.
i think we should try to gather more info on the failure before going with a blind retry. (unless, have you already done some of this investigation?)
i think a patch like this will help us see exactly which step of upgrading the secure conn is not working:
diff --git a/pkg/sql/pgwire/pre_serve.go b/pkg/sql/pgwire/pre_serve.go
index 66d1d81f86c..d0bca6747c1 100644
--- a/pkg/sql/pgwire/pre_serve.go
+++ b/pkg/sql/pgwire/pre_serve.go
@@ -23,6 +23,7 @@ import (
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgerror"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgwirebase"
"github.com/cockroachdb/cockroach/pkg/sql/pgwire/pgwirecancel"
+ "github.com/cockroachdb/cockroach/pkg/util/buildutil"
"github.com/cockroachdb/cockroach/pkg/util/log"
"github.com/cockroachdb/cockroach/pkg/util/metric"
"github.com/cockroachdb/cockroach/pkg/util/mon"
@@ -446,6 +447,10 @@ func (s *PreServeConnHandler) maybeUpgradeToSecureConn(
// Secure mode: disallow if TCP and the user did not opt into
// non-TLS SQL conns.
if !s.cfg.AcceptSQLWithoutTLS && connType != hba.ConnLocal && connType != hba.ConnInternalLoopback {
+ // Extra logs under test to debug TestAuthenticationAndHBARules.
+ if buildutil.CrdbTestBuild {
+ log.Warningf(ctx, "client cannot connect since version=%d AcceptSQLWithoutTLS=false and connType=%s", version, connType)
+ }
clientErr = pgerror.New(pgcode.ProtocolViolation, ErrSSLRequired)
}
return
@@ -458,6 +463,10 @@ func (s *PreServeConnHandler) maybeUpgradeToSecureConn(
// we don't want it.
clientErr = pgerror.New(pgcode.ProtocolViolation,
"cannot use SSL/TLS over local connections")
+ // Extra logs under test to debug TestAuthenticationAndHBARules.
+ if buildutil.CrdbTestBuild {
+ log.Warningf(ctx, "client cannot connect since version=%d and connType=%s", version, connType)
+ }
return
}
@@ -479,6 +488,10 @@ func (s *PreServeConnHandler) maybeUpgradeToSecureConn(
if tlsConfig == nil {
// We don't have a TLS configuration available, so we can't honor
// the client's request.
+ // Extra logs under test to debug TestAuthenticationAndHBARules.
+ if buildutil.CrdbTestBuild {
+ log.Infof(ctx, "sending sslUnsupported message to client")
+ }
n, serverErr = conn.Write(sslUnsupported)
if serverErr != nil {
return
we can merge it, and then next time it fails we should have a helpful clue. what do you think about this?
informs cockroachdb#131532 informs cockroachdb#131110 informs cockroachdb#130253 informs cockroachdb#127745 Epic CRDB-41958 TestAuthenticationAndHBARules fails for `special_cases`, `hba_default_equivalence`, `empty_hba` data driven tests for secure mode. The failures occur when root user is trying to authenticate with cert-password auth method and `sslmode` is set to `verify-ca` with `sslcert` being empty. The expected behavior is root authentication defaults to password method and fails as no password is set for root, but instead we get: ``` SSL is not enabled on the server ``` Since the failures are there only under stress, it might be because db server shutdown or paused before responding to request for upgrade connection to SSL from lib/pq client from here https://github.com/lib/pq/blob/3d613208bca2e74f2a20e04126ed30bcb5c4cc27/conn.go#L1116-L1130. Retrying connection establishment when this specific error is obtained might fix the problem as this logic seems faulty(it checks for absence of 'S' in server response whereas the correct check should be for 'N' in response). Release note: None
Epic CRDB-41958 Additional test build logs are added to verify the step for which `maybeUpgradeToSecureConn` fails for `TestAuthenticationAndHBARules`. Release note: None
0e1ff0d
to
c81da1f
Compare
sql: fix TestAuthenticationAndHBARules stress tests SSL issue
informs #131532
informs #131110
informs #130253
informs #127745
Epic CRDB-41958
TestAuthenticationAndHBARules fails for
special_cases
,hba_default_equivalence
,empty_hba
data driven tests for secure mode. The failures occur when root user is trying to authenticate with cert-password auth method andsslmode
is set toverify-ca
withsslcert
being empty. The expected behavior is root authentication defaults to password method and fails as no password is set for root, but instead we get:Since the failures are there only under stress, it might be because db server shutdown or paused before responding to request for upgrade connection to SSL from lib/pq client from here
https://github.com/lib/pq/blob/3d613208bca2e74f2a20e04126ed30bcb5c4cc27/conn.go#L1116-L1130. Retrying connection establishment when this specific error is obtained might fix the problem as this logic seems faulty(it checks for absence of 'S' in server response whereas the correct check should be for 'N' in response).
pgwire: add test build logs for upgrade secure conn
Additional test build logs are added to verify the step for which
maybeUpgradeToSecureConn
fails.Release note: None