Server side tracing fails to capture encrypted traffic for forking servers #2030

Open
ddelnano opened this issue Sep 20, 2024 · 0 comments
Labels
area/datacollector Issues related to Stirling (datacollector)

ddelnano commented Sep 20, 2024

Describe the bug
Certain servers fork after accepting a connection from a client. This is commonly referred to as a forking web server model, but it also occurs in non-HTTP servers. For example, postgres is well known for serving each client connection in a separate process.

Many web servers keep a pool of worker processes on standby to handle requests (pre-forking servers). These workers are longer lived because they go on to service future requests once they become idle. If such workers are processing TLS, their long lifetime means Pixie will eventually be able to attach its TLS uprobes to them. If the processes are short lived and continuously cycle through PIDs (as in postgres's case), Pixie cannot attach to a process before it exits and misses the traffic.
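The race described above can be illustrated with a toy model. It assumes, purely for illustration, that uprobes are attached to newly seen PIDs on a periodic scan; this is not a description of Stirling's actual attach logic, and the `Proc` type, the `traced_pids` helper, the lifetimes, and the scan interval are all made up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proc:
    pid: int
    start_ms: int  # when the process was forked
    end_ms: int    # when the process exited


def traced_pids(procs, scan_interval_ms):
    """Return the PIDs that are alive during at least one scan tick.

    Hypothetical model: a scanner wakes at scan_interval_ms, 2 * scan_interval_ms, ...
    and can only attach to processes that are alive at that instant.
    """
    caught = set()
    for p in procs:
        # Index of the first tick at or after the process start (ceiling division).
        k = max(1, -(-p.start_ms // scan_interval_ms))
        first_tick_ms = k * scan_interval_ms
        if first_tick_ms < p.end_ms:
            caught.add(p.pid)
    return caught


procs = [
    Proc(pid=100, start_ms=0, end_ms=600_000),       # long-lived pre-fork worker
    Proc(pid=200, start_ms=1_000, end_ms=1_050),     # short-lived per-connection child
    Proc(pid=201, start_ms=12_300, end_ms=12_400),   # another short-lived child
]
result = traced_pids(procs, scan_interval_ms=10_000)
print(result)
```

Under these assumed numbers only the long-lived worker is ever seen by the scanner; the per-connection children start and exit between ticks.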

Potential Solutions

Since the main server process is long lived, we need to investigate whether it's possible to uprobe the main process and have the child processes inherit the instrumentation. We couldn't do this blindly, as I think we would also catch processes that aren't pre-forking servers.
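One conceivable guard against instrumenting unrelated children is to treat only children that still run the parent's executable as inheriting the instrumentation, so that a process which forks and then execs a different binary is excluded. The sketch below is a hypothetical heuristic and not something Pixie implements; `ProcInfo` and `inheriting_children` are illustrative names, and the process table is passed in as plain data rather than read from the system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcInfo:
    pid: int
    ppid: int
    exe: str  # path of the executable the process is running


def inheriting_children(procs, instrumented_pid):
    """Return the sorted PIDs of direct children that run the same executable as the parent."""
    by_pid = {p.pid: p for p in procs}
    parent = by_pid.get(instrumented_pid)
    if parent is None:
        return []
    return sorted(
        p.pid for p in procs
        if p.ppid == parent.pid and p.exe == parent.exe
    )


table = [
    ProcInfo(pid=1, ppid=0, exe="/usr/bin/postgres"),  # main server process
    ProcInfo(pid=2, ppid=1, exe="/usr/bin/postgres"),  # forked per-connection child
    ProcInfo(pid=3, ppid=1, exe="/usr/bin/postgres"),  # forked per-connection child
    ProcInfo(pid=4, ppid=1, exe="/bin/sh"),            # child that exec'd another binary
]
children = inheriting_children(table, instrumented_pid=1)
print(children)
```

A check like this would still need validating against real workloads, since a same-executable child is not necessarily handling connections.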

Short term Solution

For newer protocols, Pixie has made the decision to trace server side only. This means that if the client and server are both within the cluster, the client's traces are dropped. In this situation, since the postgres server is never traced properly, the result is no data at all. It would be beneficial to have a runtime override that enables client side tracing for specific protocols. This would give end users who use postgres with TLS a way to capture their traffic without needing to enable client side tracing for all protocols.
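The decision described here, plus the proposed per-protocol override, can be sketched as a small predicate. This is a minimal illustration for a protocol that is traced server side only; the function name, the `client_side_overrides` parameter, and the `"pgsql"` protocol label are assumptions made for the example, not existing Pixie flags or APIs.

```python
def keep_trace(protocol, role, peer_in_cluster, client_side_overrides=frozenset()):
    """Decide whether a captured trace is kept (hypothetical model).

    role is "client" or "server"; peer_in_cluster says whether the other
    end of the connection also runs inside the cluster.
    """
    if role == "server":
        return True
    # Client side: a per-protocol override keeps the trace unconditionally.
    if protocol in client_side_overrides:
        return True
    # Otherwise the client trace is dropped when the server is in-cluster,
    # on the expectation that the server side already traced it.
    return not peer_in_cluster


# Without an override, an in-cluster postgres client trace is dropped.
default_result = keep_trace("pgsql", "client", peer_in_cluster=True)
# With the override, only postgres client traces are kept.
override_result = keep_trace("pgsql", "client", True, client_side_overrides={"pgsql"})
other_protocol = keep_trace("http", "client", True, client_side_overrides={"pgsql"})
print(default_result, override_result, other_protocol)
```

Scoping the override to a set of protocols leaves the server-side-only behavior unchanged for every protocol the user has not opted in.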

To Reproduce
Steps to reproduce the behavior:

  1. Deploy a postgres client and server that use TLS within the same cluster (a setup that causes Pixie to ignore the client side traces)
  2. See that no traces are captured unless client side postgres tracing is used

Expected behavior
Pixie should be able to perform server side TLS tracing on postgres.

Logs
While debugging this with a member of the community, we captured the following logs to come to this issue's conclusion.
postgres-strace.txt
postgres-server-side-pem.log.zip

App information (please complete the following information):

  • Pixie version: 0.14.11
  • K8s cluster version: N/A
  • Node Kernel version: N/A
@ddelnano ddelnano added the area/datacollector Issues related to Stirling (datacollector) label Sep 20, 2024
@ddelnano ddelnano changed the title from "Server side tracing fails to capture encrypted traffic for certain pre-fork servers" to "Server side tracing fails to capture encrypted traffic for forking servers" Sep 23, 2024