Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Retry auth in checkClientCertAuthentication #19719

Merged
merged 1 commit into from
Dec 11, 2023

Conversation

mvollmer
Copy link
Member

@mvollmer mvollmer commented Dec 8, 2023

No description provided.

@mvollmer mvollmer added the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Dec 8, 2023
@mvollmer mvollmer changed the title DBG - amplify DBG - TestAD.testClientCertAuthentication Dec 8, 2023
@mvollmer
Copy link
Member Author

mvollmer commented Dec 8, 2023

I did see this locally with 50% chance until I seriously started looking for reasons, then it stopped failing, of course. Current theory: it is something intermediate that goes away with retrying.

test/verify/check-system-realms Fixed Show fixed Hide fixed
test/verify/check-system-realms Fixed Show fixed Hide fixed
@mvollmer mvollmer force-pushed the debug-ad-client-certs branch 2 times, most recently from f260c01 to 483bc3c Compare December 8, 2023 16:05
@mvollmer
Copy link
Member Author

mvollmer commented Dec 8, 2023

Experiment results: The first password auth attempt in checkClientCertAuthentication sometimes fails with an internal error. Retrying it after five seconds always succeeds. Delaying the first try by ten seconds also always succeeds.

I have no idea why it fails the first time and why it succeeds the second time. I don't think we need to find out.

@mvollmer mvollmer force-pushed the debug-ad-client-certs branch from 483bc3c to bf04ab6 Compare December 8, 2023 16:44
@martinpitt
Copy link
Member

So perhaps the while ! id alice; do sleep 5; done check isn't sufficient, and we need to wait until password auth works? That might also explain some related flakes.

@mvollmer mvollmer force-pushed the debug-ad-client-certs branch 2 times, most recently from 5a8455d to f2a2784 Compare December 11, 2023 08:07
@mvollmer
Copy link
Member Author

So perhaps the while ! id alice; do sleep 5; done check isn't sufficient, and we need to wait until password auth works?

During my testing, id alice either worked on first try, or would never succeed. The latter is probably samba-in-kubernetes/samba-container#160.

How would you wait until passwd auth works? I think just retrying inside do_test is ok.

@mvollmer mvollmer force-pushed the debug-ad-client-certs branch from f2a2784 to 1638835 Compare December 11, 2023 08:43
@martinpitt
Copy link
Member

@mvollmer Note that samba-in-kubernetes/samba-container#160 applied to the other container image that we used in the meantime. I went back to https://github.com/Fmstrat/samba-domain, and it at least made that problem better. You can check sssctl domain-status cockpit.lan if it's the same problem, but that was very binary: It either worked right away or never. So if you can "fix" this by retrying, it's definitively something else.

Not sure about checking if pwd auth works -- in that case we can just retry the whole curl thing I suppose. There's only so much yak that we can and want to shave 😁

@mvollmer
Copy link
Member Author

@mvollmer
Copy link
Member Author

It either worked right away or never. So if you can "fix" this by retrying, it's definitively something else.

No, the while ! id alice loop will time out and the test will fail right there. The password auth will fail after id alice has succeeded on first try, and the succeed after a retry.

@mvollmer mvollmer force-pushed the debug-ad-client-certs branch 2 times, most recently from 036d4ff to 179b9f2 Compare December 11, 2023 09:03
@mvollmer mvollmer removed the no-test For doc/workflow changes, or experiments which don't need a full CI run, label Dec 11, 2023
@mvollmer mvollmer changed the title DBG - TestAD.testClientCertAuthentication test: Retry auth in checkClientCertAuthentication Dec 11, 2023
@mvollmer mvollmer marked this pull request as ready for review December 11, 2023 09:06
@mvollmer mvollmer requested a review from martinpitt December 11, 2023 09:12
Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Argh, the faulty flake8 unit-test container update does not affect just bots https://github.com/cockpit-project/bots/issues/5659 but also cockpit. That is urgent I suppose.

Note that this is still no-test.

Thanks!

test/verify/check-system-realms Outdated Show resolved Hide resolved
Password authentication sometimes fails on the first try.
@mvollmer mvollmer force-pushed the debug-ad-client-certs branch from 179b9f2 to 6dab14a Compare December 11, 2023 12:35
@mvollmer mvollmer requested a review from martinpitt December 11, 2023 12:35
Copy link
Member

@martinpitt martinpitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Danke! I bow to the bots in their infinite wisdom for the final verdict, of course. 🙇

@@ -427,11 +435,13 @@ class CommonTests:

# from sssd
self.allow_journal_messages("alice is not allowed to run sudo on x0. This incident will be reported.")
# occasional intermediate error during password auth
self.allow_journal_messages("cockpit-session: user account access failed: 4 alice: System error")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "4" is funny. But we can keep this for the time being, and replace with .* if/when it becomes a problem.

@martinpitt
Copy link
Member

martinpitt commented Dec 11, 2023

Meh, TestAD failure. That's a different test, but possibly same root cause?

another case

Retried.

@martinpitt
Copy link
Member

After 5 retries, I'll look the other way here.

@martinpitt martinpitt merged commit a61bb41 into cockpit-project:main Dec 11, 2023
88 of 91 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants