Apologies if a similar proposal already exists, but allow me to describe the requirement.
Background
As far as I am aware, the Go runtime today provides two ways to obtain a stack trace: runtime.Stack(buf, all=true) and runtime.Stack(buf, all=false). debug.Stack is a convenience helper that returns the current goroutine's stack, starting with a 1 KB buffer and growing it until the trace fits.
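For reference, the existing options can be exercised as in the minimal sketch below. The helper names `currentStack` and `allStacks` and the buffer sizes are illustrative assumptions, not part of any Go API; only `runtime.Stack` and `debug.Stack` come from the standard library.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

// currentStack returns the stack trace of the calling goroutine only.
// The 64 KB buffer is an arbitrary choice; the trace is truncated if it does not fit.
func currentStack() []byte {
	buf := make([]byte, 64<<10)
	n := runtime.Stack(buf, false)
	return buf[:n]
}

// allStacks returns the stack traces of every goroutine.
// With all=true the runtime has to pause the program while it collects
// the traces, which is the expensive path described in this issue.
func allStacks() []byte {
	buf := make([]byte, 1<<20)
	n := runtime.Stack(buf, true)
	return buf[:n]
}

func main() {
	fmt.Printf("current goroutine only:\n%s\n", currentStack())
	fmt.Printf("all goroutines:\n%s\n", allStacks())
	// debug.Stack sizes its own buffer and covers the current goroutine only.
	fmt.Printf("debug.Stack:\n%s\n", debug.Stack())
}
```

Neither form follows the "created by" relationship from the current goroutine back to the goroutines that started it, which is the gap this proposal is about.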
In a large-scale application [we run enterprise-level applications that support many large services in the cloud], we have found debugging to be a challenge. The current-goroutine-only stack trace is far too limited and often gives us little clue as to how we got here. The full all-goroutine dump is far too costly to deploy deep in our stack [we recently noticed a performance regression caused by deploying a self-implemented full stack dump]. The full dump is fairly useful, but we also need to focus on just the call chain we care about [there are many other goroutines from various areas and libraries that typically add no useful value].
The question is: can the Go runtime provide a balanced view of one particular call chain, starting from the goroutine that makes the call [for example, to runtime.Chain] and going all the way up?
This would be a particularly useful signal because it concentrates only on the goroutine that issued the call, which is typically the one that received some kind of error [either from internal processing or from an external source].
Our home-grown solution so far is to produce a runtime.Stack(64k, all=true) dump and then walk the goroutine sections one by one via text-based parsing with a few heuristics, for example '^goroutine ' or ' created by ', to find the call relationships. This is not only very costly but also error-prone.
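To make the cost and fragility concrete, here is a rough sketch of what such text-based parsing looks like. It is not our production code: the `goroutineRecord` type, the `parseDump` function, and the sample dump are made up for illustration. The optional "in goroutine N" suffix is handled because recent Go releases append the creator's ID to the "created by" line; older releases print only the function name.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// goroutineRecord is an illustrative type holding what the heuristics can recover.
type goroutineRecord struct {
	ID        int
	CreatedBy string // function that started the goroutine; empty if not reported
	ParentID  int    // creator's ID; 0 when the dump does not report it
}

var (
	headerRE  = regexp.MustCompile(`^goroutine (\d+) \[`)
	createdRE = regexp.MustCompile(`^created by (\S+)(?: in goroutine (\d+))?`)
)

// parseDump walks an all-goroutine dump line by line and records, for each
// goroutine section, its ID and whatever the "created by" line reveals.
func parseDump(dump string) []goroutineRecord {
	var recs []goroutineRecord
	for _, line := range strings.Split(dump, "\n") {
		if m := headerRE.FindStringSubmatch(line); m != nil {
			id, _ := strconv.Atoi(m[1])
			recs = append(recs, goroutineRecord{ID: id})
			continue
		}
		if m := createdRE.FindStringSubmatch(line); m != nil && len(recs) > 0 {
			cur := &recs[len(recs)-1]
			cur.CreatedBy = m[1]
			if m[2] != "" {
				cur.ParentID, _ = strconv.Atoi(m[2])
			}
		}
	}
	return recs
}

// sample is a hand-written dump in the shape runtime.Stack produces.
const sample = `goroutine 1 [running]:
main.main()
	/tmp/main.go:10 +0x1d

goroutine 18 [chan receive]:
main.worker()
	/tmp/main.go:20 +0x2a
created by main.main in goroutine 1
	/tmp/main.go:8 +0x3c
`

func main() {
	for _, r := range parseDump(sample) {
		fmt.Printf("goroutine %d created by %q (parent %d)\n", r.ID, r.CreatedBy, r.ParentID)
	}
}
```

Everything here depends on the exact text layout of the dump, which is not a stable interface, and the whole dump has to be produced before any of it can be filtered.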
Thus, we'd like to see whether it is feasible for the Go runtime to provide a new API, say runtime.Chain, that produces a call chain dump based on the "current goroutine".
Given that the current stack dump annotates goroutines with " created by", I assume the runtime already has at least some internal bookkeeping to reason about the call relationship. Since Go doesn't advocate goroutine-based programming and there aren't many other options out there [are there?], we turn to the Go team for help.
Some proposed semantics:
Producing such a chain may still be a non-trivial undertaking. Would it be possible to design APIs through which the runtime emits a list of goroutine IDs [or some cheap metadata, if you still don't want to disclose implementation details] so that we could at least see a somewhat complete call stack? Or provide some sort of object handle that lets the application choose what to dump?
The output is ordered along the call chain, upwards. The goroutine that called runtime.Chain would be the first entry in the output, its caller the second, and so forth. If the emitted metadata carries programmatically readable hints about the relationship, e.g. created by, then unordered output is fine and the application can stitch the entries together.
It is possible that, by the time the chain is composed, some of the goroutines have already terminated and been purged from memory. I don't know the runtime well enough to propose what should happen when there is a gap, but one option would be to end the chain there. Often, just one or two more links in the call chain greatly improve debuggability.
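As a sketch of the "stitch it together in the application" option, suppose the runtime handed back unordered per-goroutine metadata. The `GoroutineInfo` type and `stitchChain` function below are purely hypothetical; nothing like them exists in the runtime package today. The chain starts at the goroutine that asked for it, follows the creator links upwards, and simply ends when it reaches a goroutine that is no longer known.

```go
package main

import "fmt"

// GoroutineInfo is a hypothetical metadata record; it is not a real runtime type.
type GoroutineInfo struct {
	ID       uint64
	ParentID uint64 // ID of the creating goroutine; 0 if there is none or it is unknown
}

// stitchChain orders unordered metadata into a chain that starts at the
// goroutine `start` and follows creator links upwards. It stops at the
// first gap, i.e. when a creator is no longer present in infos.
func stitchChain(infos []GoroutineInfo, start uint64) []uint64 {
	byID := make(map[uint64]GoroutineInfo, len(infos))
	for _, gi := range infos {
		byID[gi.ID] = gi
	}
	var chain []uint64
	seen := make(map[uint64]bool) // defensive guard against malformed, cyclic input
	for id := start; id != 0 && !seen[id]; {
		gi, ok := byID[id]
		if !ok {
			break // gap: the creator has already exited, so end the chain here
		}
		chain = append(chain, id)
		seen[id] = true
		id = gi.ParentID
	}
	return chain
}

func main() {
	infos := []GoroutineInfo{
		{ID: 3, ParentID: 1},
		{ID: 7, ParentID: 3},
		{ID: 1, ParentID: 0},
		{ID: 9, ParentID: 8}, // goroutine 8 has already exited
	}
	fmt.Println(stitchChain(infos, 7))
	fmt.Println(stitchChain(infos, 9))
}
```

With metadata this small, the application could then decide which of the goroutines in the chain are worth a full stack dump.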
Thank you!
Jim
We have not. As a matter of fact, I searched the entire codebase and found zero references to this flag in any of our production services... [FYI, I am actually talking about Google Borg... :) ]