Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use key to identify log entries from the same request. #199

Closed
ryansanford opened this issue Mar 15, 2016 · 10 comments
Closed

Use key to identify log entries from the same request. #199

ryansanford opened this issue Mar 15, 2016 · 10 comments
Assignees

Comments

@ryansanford
Copy link
Contributor

On a busy system, when an error occurs within scitran/core, it can difficult and time consuming to identify which log entries relate to the same request.

Suggest enhancement to create a key for each request, and include it in every log entry in a way that is easy to grep out, like (request_key:abc123def456). Implementation should avoid request_key collisions between threads for a sane number of requests.

For reference, my uwsgi log file with 1986753 lines is 138423095 bytes in size.

@nagem
Copy link
Contributor

nagem commented Mar 15, 2016

I'm thinking a few random alphanumerics mixed with a date-time stamp would keep us unique enough.

@gsfr
Copy link
Member

gsfr commented Mar 16, 2016

Wondering if webapp2 might already have something in store for us. Might be worth checking self and self.request.

@kofalt
Copy link
Contributor

kofalt commented Mar 16, 2016

I've seen this used before to great effect: https://github.com/newobj/nginx-x-rid-header

Works out of the box; just log the header :)
Also good in that it works against any other components in the stack - hint: engine, etc.

@gsfr
Copy link
Member

gsfr commented Mar 16, 2016

I poked at webapp2 requests for a sec while looking into #200. Doesn't look like webapp2 offers a unique request id out of the box. Would be easy enough to generate one using the uuid module and store in on the request (self).

@kofalt's proposal looks simple, but locks us into nginx, which I'd prefer to avoid. Sure, it looks like the obvious choice today, but we thought the same thing about uwsgi at some point. Now, we're less sure.

I don't understand the multi-component remark. What's the core/engine interaction for web requests?

@kofalt
Copy link
Contributor

kofalt commented Mar 17, 2016

The engine and reapers route through nginx, so they'd also be able to log the traced header in their logs. This allows us to trace a request through a system, which is critical.

The industry pattern is to attach these tracers at the outer layer, so I'd be happier to see them added there, at our rev-proxy. If we were taking bets, my money would be on nginx being the last currently-existing component to be switched out, plus it's trivial to generate guids in other rev-proxy products.

If we're planning to stick with New Relic in the longer term (my read on that situation is "no", due to the cost), it might be worth checking out their plugin to see if they can enable this feature. It wasn't clear from what I saw.

@gsfr
Copy link
Member

gsfr commented Mar 18, 2016

@ryansanford Are you comfortable compiling nginx from source, which the suggested module requires?

@ryansanford
Copy link
Contributor Author

My suggestion is to define and document a header that all scitran/core logging will use as a key, and provide UUID generation on entry, or first log event if that header is not provided. That allows 3rd parties the flexibility to generate and use this request_key outside the scitran/core stack if they wish.

Another thing to consider now or down the road is forwarding this request_uuid header within requests to other scitran instances.

Implementing the request_key generation using nginx or any other http proxy/host or even the client then is not a direct concern for this ticket, or scitran/core in general. (yes, I'm comfortable compiling, but we could get similar behavior with nginx-extras package and perl or lua.)

@ryansanford
Copy link
Contributor Author

@rentzso @kofalt
Externally defined request_key could also be a plus with automated unit testing. If each test sends its own distinct request_key, on failure we could provide or highlight only the log entries for the tests that fail.

Requirement would be that the http host for the unit tests would not overwrite this request_key header.

@kofalt
Copy link
Contributor

kofalt commented Mar 18, 2016

If I'm understanding you correctly, you're asking that we define a header which, if present, is logged by the API (and other components when/if applicable), and we can leave the methodology of generating that header value in your court? That sounds entirely reasonable to me.

One slight modification is that I wouldn't want to allow clients (user clients at least) to provide their own identifiers, since they're not trusted actors.

@coltonlw
Copy link
Contributor

Docs for LoggerAdapter, a way to add contextual information to a logger:
https://docs.python.org/2/library/logging.html#loggeradapter-objects
https://docs.python.org/2/howto/logging-cookbook.html#context-info

Snippet for generating a request id:

import time
import uuid

request_id = "{random_chars}-{timestamp}".format(
    timestamp = str(int(time.time())),
    random_chars = str(uuid.uuid4().hex)[:8]
    )

print(request_id)

a7c49271-1471524694

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants