Split deep vmem_alloc()/vmem_xalloc() stacks #94
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
In #90
a user reported panics on an M1 with the message
"Invalid kernel stack pointer (probable overflow)."
In at least several of these a deep multi-arena allocation
was in progress (several vmem_alloc/vmem_xalloc reaching
all the way down through vmem_bucket_alloc,
xnu_alloc_throttled, and ultimately to osif_malloc).
The stack frames above the first vmem_alloc were also fairly large.
This commit sets a dynamically sysctl-tunable threshold
(8k default) for remaining stack size as reported by xnu.
If we do not have more bytes than that when vmem_alloc()
is called, then the actual allocation will be done in a
separate worker thread which will start with a nearly
empty stack that is much more likely to hold the various
frames all the way through our code boundary with the
kernel and beyond.
The xnu / mach thread_call API (osfmk/kern/thread_call.h)
is used to avoid circular dependencies with taskq, and the
mechanism is per-arena costing a quick stack-depth check
per vmem_alloc() but allowing for wildly varying stack
depths above the first vmem_alloc() call.
Vmem arenas now have two further kstats: the lowest amount
of available stack space seen at a vmem_alloc() into it,
and the number of times the allocation work has been done
in a thread_call worker.
These are small functions with no or very few automatic
variables that were good candidates for clang/llvm's
inlining heuristics before we switched to building
the kext with -finline-hint-functions.
previous commits, eliminating a couple compile-time warnings
Motivation and Context
Description
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by
.