OpenSSL commands for troubleshooting certificate issues. #1426

Open
wants to merge 6 commits into
base: master

arooshap
Member

@arooshap arooshap commented Oct 12, 2023

Hi @khurtado, @amaltaro, @todor-ivanov,

I have tried to list as many details and commands as possible that can help the WMCore team debug and troubleshoot their services.

@arooshap arooshap linked an issue Oct 12, 2023 that may be closed by this pull request
3 tasks
@khurtado
Contributor

@arooshap This is great, I only made a few minor comments. Thank you so much for working on this!

@arooshap
Member Author

@khurtado I don't see your comments. Could you please let me know where you made them?

Thanks.

@khurtado
Contributor

Contributor

@amaltaro amaltaro left a comment

@arooshap Aroosha, this documentation is looking pretty good, thank you for putting effort into this!

I left a few comments/questions for your consideration. In addition to that, I have some more general comments, such as:

  • perhaps the explanation of Calico and CoreDNS should be moved somewhere else? To my understanding, they are part of the k8s architecture, so maybe moving it under https://cms-http-group.docs.cern.ch/k8s_cluster/architecture/ would be more meaningful(?)
  • I also feel like some of the debugging documented here is duplicated with the "Troubleshooting" section in cmsweb-docs. Perhaps some of the common kubectl commands could be removed. Or even better, putting a pointer to the cmsweb-docs troubleshooting at the top of this debugging documentation?

In the end, I didn't want this to be a WMCore-specific debugging guide, but something that we can share with other cmsweb Kubernetes tenants.
@khurtado what do you think? If you have a different view on how this documentation and/or my comments should be, please speak up.

```
openssl x509 -noout -subject -in file.pem
```
- Who is in charge of updating the certificates?

I would suggest clarifying which certificates we are talking about here. Is it the robot/service certificates used by the backend services (like those used in a few WM services)?
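
Whichever certificates this section ends up covering, the subject check quoted above can be extended to show the issuer and expiry as well. The following is a minimal sketch, not part of the PR: the function names `cert_summary` and `cert_valid_for_days` are hypothetical, and it assumes only a PEM file on disk and the `openssl` CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (names are illustrative, not from the PR).

# Print the subject, issuer, and validity window of a PEM certificate.
cert_summary() {
    openssl x509 -noout -subject -issuer -dates -in "$1"
}

# Exit 0 if the certificate is still valid N days from now, non-zero otherwise.
# -checkend takes a number of seconds, so days are converted first.
cert_valid_for_days() {
    local pem="$1" days="$2"
    openssl x509 -noout -checkend "$(( days * 86400 ))" -in "$pem" >/dev/null
}
```

For example, `cert_summary file.pem` prints the details, and `cert_valid_for_days file.pem 14 || echo "renew soon"` flags a certificate that expires within two weeks.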

- Where are the service certificates for the dmwm services located?

- Exec into the pod.
- cd /data/srv/current/auth/reqmgr2ms/

Do they get mounted? Where do they come from (/etc/... ?)?

@@ -0,0 +1,206 @@
## Debugging issues with the certificates.

- How do I change p12 to pem?

I would say these 3 lines about how to use openssl command could be removed. But it's up to you.
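
If the conversion steps are kept, they could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `p12_to_pem`, the output file arguments, and the `P12_PASSWORD` environment variable are hypothetical and do not come from the PR.

```shell
#!/usr/bin/env bash
# Split a PKCS#12 bundle into a PEM certificate and an unencrypted PEM key.
# Assumption: the bundle's import password is exported as P12_PASSWORD
# (a hypothetical variable name chosen for this sketch).
p12_to_pem() {
    local p12="$1" cert_out="$2" key_out="$3"

    # Extract the client certificate only, without the private key.
    openssl pkcs12 -in "$p12" -clcerts -nokeys \
        -out "$cert_out" -passin env:P12_PASSWORD || return 1

    # Extract the private key only. -nodes writes it unencrypted, so the
    # restrictive umask keeps the file readable by the owner alone.
    ( umask 077 && openssl pkcs12 -in "$p12" -nocerts -nodes \
        -out "$key_out" -passin env:P12_PASSWORD ) || return 1
}
```

A call such as `P12_PASSWORD=... p12_to_pem mycert.p12 usercert.pem userkey.pem` would then produce the two PEM files; on OpenSSL 3.x, `-noenc` is the newer spelling of `-nodes`.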

- **Key Responsibilities**:
- **Network Connectivity**: Calico Node assigns a unique IP address to each pod on the node, enabling network connectivity for the pods.
- **Network Policy Enforcement**: It enforces network policies defined in Kubernetes that control pod-to-pod communication, allowing or denying traffic based on policy rules.
- **Routing**: Calico Node uses the Border Gateway Protocol (BGP) to establish and manage routing between nodes, ensuring that pods can communicate across the cluster.

Probably out of scope: but why do we need to enable communication across the pods? According to CMSWEB design, isn't the traffic always supposed to go through the frontends?

When you want to add a new service to be hosted under CMSWeb, you have to take the following steps:
- The [deployment/frontend](https://github.com/dmwm/deployment/tree/master/frontend)
repository contains SSL and NOSSL redirect rules for individual services or namespaces. They are labeled as app\_\<service-name\>\_ssl.conf and app\_\<service-name\>\_nossl.conf. You have to insert redirect rules there.
- Then, we need to insert rules in backends-k8s-prod.txt and backends-k8s-preprod.txt to let the frontends know where we would like our requests to be redirected. For K8s, you just need to supply a single `backends.txt` file with proper rules. For the production cluster, this file is in the [services_config/-/tree/cmsweb/frontend-ds](https://gitlab.cern.ch/cmsweb-k8s/services_config/-/tree/cmsweb/frontend-ds?ref_type=heads) branch; for the preproduction cluster, it is in [services_config/-/tree/preprod/frontend-ds](https://gitlab.cern.ch/cmsweb-k8s/services_config/-/tree/preprod/frontend-ds?ref_type=heads); and for the test cluster, it is in [services_config/-/tree/test/frontend-ds](https://gitlab.cern.ch/cmsweb-k8s/services_config/-/tree/test/frontend-ds?ref_type=heads).

Aroosha, do you think a table could be created for the environment vs branch hosting the frontends configuration? We might want to decide to place this content in a different place as well.
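
As one possible way to capture that environment-to-branch mapping, here is a small lookup sketch built only from the three links in the quoted paragraph. The function names and the short environment aliases (`prod`, `preprod`, `test`) are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Map a cluster environment to the services_config branch that holds the
# frontend backends.txt. Branch names are taken from the links quoted above;
# the environment aliases are hypothetical.
frontend_config_branch() {
    case "$1" in
        prod|production)       echo "cmsweb"  ;;
        preprod|preproduction) echo "preprod" ;;
        test)                  echo "test"    ;;
        *) echo "unknown environment: $1" >&2; return 1 ;;
    esac
}

# Build the GitLab URL of the frontend-ds directory for an environment.
frontend_config_url() {
    local branch
    branch="$(frontend_config_branch "$1")" || return 1
    echo "https://gitlab.cern.ch/cmsweb-k8s/services_config/-/tree/${branch}/frontend-ds"
}
```

Calling `frontend_config_url preprod` prints the preproduction link, and an unrecognised environment returns a non-zero status instead of guessing a branch.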

14. **Check API Resources**:
- Get a list of available API resources with: `kubectl api-resources`.

15. **Check Component Status**:

Apologies for my ignorance. But what are these components supposed to be? Are they documented somewhere in cmsweb-docs?

@khurtado
Contributor

khurtado commented Nov 8, 2023

While I agree with this, I think a reference in this document to that section for both components would be useful for readers as well. While these are parts of the k8s architecture, they are the two main non-CMS-specific components to check for errors when debugging issues.

  • I also feel like some of the debugging documented here is duplicated with the "Troubleshooting" section in cmsweb-docs. Perhaps some of the common kubectl commands could be removed. Or even better, putting a pointer to the cmsweb-docs troubleshooting at the top of this debugging documentation?

I agree with this.

@todor-ivanov
Contributor

Hi @arooshap, thank you for this documentation. It looks great. I have only one general remark.
I hope you have a full, dedicated documentation section somewhere else for those two components: Calico and CoreDNS.
If not, please consider creating one. I found this information quite enlightening, so in my opinion it is worth a separate page.

@khurtado
Contributor

Hi @arooshap Did you have a chance to look into the PR review comments?

Development

Successfully merging this pull request may close these issues.

CMSWeb Troubleshooting Documentation for DMWM.
4 participants