Kernel refuses to start if task's initial stack frame spans regions #1943

Open
Tracked by #1967
cbiffle opened this issue Dec 10, 2024 · 0 comments
Labels
kernel Relates to the Hubris kernel

cbiffle commented Dec 10, 2024

The exhubris tools are turning out to be a pretty good kernel torture test. 😄

During the reinitialize operation that starts (and restarts) a task, the kernel (on ARM M-profile at least) needs to deposit an initial stack frame in the task's stack memory, from which the hardware will load the task's initial register values. Before we do that, we "zap" the stack, overwriting it with a predictable pattern to help monitor a task's stack usage high-watermark.

The code in question is here:

// Before we set our frame, find the region that contains the top word of
// the stack -- one word below the initial stack pointer -- and zap the
// region from the base to the stack pointer with a distinct (and storied)
// pattern.
//
// Note that if the initial stack pointer is zero, we use saturating
// arithmetic and get zero for the top word, which is outside any region and
// causes this to be skipped. (Not that we expect zero, but we're the kernel
// and we don't trust tasks.)
if let Some(region) = task
    .region_table()
    .iter()
    .find(|region| region.contains(initial_stack.saturating_sub(4)))
{
    let mut uslice: USlice<u32> = USlice::from_raw(
        region.base as usize,
        (initial_stack - frame_size - region.base as usize) >> 2,
    )
    .unwrap_lite();
    let zap = task.try_write(&mut uslice).unwrap_lite();
    for word in zap.iter_mut() {
        *word = 0xbaddcafe;
    }
}

That code assumes that the memory region containing the topmost word in the stack (initial_stack.saturating_sub(4)) also contains the entire stack, because the kernel uses region.base to define the place to start zapping.

If the initial stack frame (104 bytes on Cortex-M4, thanks largely to the FPU) does not fit in the region that contains the topmost word of the stack, the length subtraction underflows, and you instead get an invalid USlice constructed using region.base and what is effectively a large negative number (a huge value once it wraps around as a usize). (If you're building with overflow checks enabled, you get a subtraction with overflow panic. I disabled overflow checks to help find the real problem.) USlice rightfully rejects this as nonsense, which the code here really wasn't expecting, since it calls unwrap_lite on the result. Panic.
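The underflow can be reproduced in isolation. The following is a minimal sketch of just the length arithmetic, not the kernel code: the function names and the addresses used in the usage note are hypothetical, and it assumes the same expression as above, `(initial_stack - frame_size - region.base) >> 2`, evaluated on `usize`.

```rust
/// Hypothetical helper: the zap length in words, computed with checked
/// arithmetic. Returns `None` exactly where the original expression would
/// underflow (and panic in a build with overflow checks enabled).
fn zap_len_words_checked(
    initial_stack: usize,
    frame_size: usize,
    region_base: usize,
) -> Option<usize> {
    initial_stack
        .checked_sub(frame_size)?
        .checked_sub(region_base)
        .map(|bytes| bytes >> 2)
}

/// Hypothetical helper: the same expression with wrapping arithmetic,
/// mimicking a build with overflow checks disabled. An underflow shows up
/// as an enormous word count instead of a panic.
fn zap_len_words_wrapping(
    initial_stack: usize,
    frame_size: usize,
    region_base: usize,
) -> usize {
    initial_stack
        .wrapping_sub(frame_size)
        .wrapping_sub(region_base)
        >> 2
}
```

As an illustration with made-up addresses: for a 256-byte stack ending at `0x2000_0100` that sits in a single region based at `0x2000_0000`, the checked version yields `Some(38)` words with a 104-byte frame. If the region holding the top word is only 32 bytes (base `0x2000_00e0`), the checked version yields `None` and the wrapping version yields a nonsensical length.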

The zap code dates to the early days of Hubris when a task would always have a single contiguous RAM region containing the stack. In mainline "Oxide Hubris" that's still true in practice: we mostly apply the region-splitting allocator to flash.

The exhubris tools split both RAM and flash, deciding which to split next based on the potential cost of fragmentation given their sizes. This means I wound up with a task with a 256-byte stack split over no fewer than five regions (sizes: 32, 64, 64, 64, 32). In that case, it takes the last three regions to cover the 104-byte initial stack frame, and hilarity ensues. Where by "hilarity" I mean "the application refusing to start."
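The region count can be checked with a small standalone sketch. It assumes the five sizes are listed in ascending address order and that the regions are contiguous, which the issue does not state explicitly; the function name is hypothetical.

```rust
/// Hypothetical helper: how many regions, counting down from the top of the
/// stack, are needed to hold a frame of `frame_size` bytes.
///
/// `region_sizes` is assumed to be in ascending address order, so the last
/// entry is the region containing the topmost word of the stack. Returns
/// `None` if the frame does not fit in the regions at all.
fn regions_spanned_from_top(region_sizes: &[usize], frame_size: usize) -> Option<usize> {
    let mut remaining = frame_size;
    for (index, size) in region_sizes.iter().rev().enumerate() {
        if remaining <= *size {
            // The frame ends inside this region.
            return Some(index + 1);
        }
        // The frame consumes this whole region and continues downward.
        remaining -= *size;
    }
    None
}
```

With sizes 32, 64, 64, 64, 32 and a 104-byte frame, the top region absorbs 32 bytes, the next absorbs 64, and the remaining 8 bytes land in a third region, so the result is `Some(3)`.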

Interestingly, the actual code for writing the frame (just below) works fine if I disable the stack zap code, because it doesn't try to be clever with regions. So, the zap code itself needs to be improved to handle more elaborate RAM region splits.
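One possible shape for a region-aware zap is sketched below. This is not the Hubris implementation or a proposed patch: the `Region` type, the `zap_ranges` function, and the assumption that the kernel knows the stack's base address are all hypothetical, introduced only to illustrate clamping the zap range to each region instead of relying on a single region.base.

```rust
/// Hypothetical stand-in for a task memory region (not the kernel's type).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Region {
    base: usize,
    size: usize,
}

/// Hypothetical helper: for each region, compute the slice of the range
/// `[stack_base, initial_stack - frame_size)` that falls inside it, as
/// `(start_address, length_in_words)`. Regions lying entirely within the
/// initial frame, or outside the stack, produce no entry.
fn zap_ranges(
    regions: &[Region],
    stack_base: usize,
    initial_stack: usize,
    frame_size: usize,
) -> Vec<(usize, usize)> {
    // If the frame is larger than the stack pointer itself, zap nothing
    // rather than underflow.
    let zap_end = match initial_stack.checked_sub(frame_size) {
        Some(end) => end,
        None => return Vec::new(),
    };
    regions
        .iter()
        .filter_map(|r| {
            let start = r.base.max(stack_base);
            let end = r.base.saturating_add(r.size).min(zap_end);
            if start < end {
                Some((start, (end - start) >> 2))
            } else {
                None
            }
        })
        .collect()
}
```

For the five-region layout above, with a made-up base of `0x2000_0000`, this produces three ranges of 8, 16, and 14 words. Together they cover the same 38 words a single contiguous region would, and the two topmost regions are left alone because the initial frame occupies them entirely.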

I may land an initial fix that just makes the zap code optional, since the stack zapping method is inexact anyway (#1872).
