Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve compartment example security #55

Merged
merged 1 commit into from
Jan 12, 2022

Conversation

0152la
Copy link
Contributor

@0152la 0152la commented Dec 16, 2021

In this example, we tighten security even further. For each compartment we
create, we store some capabilities of interest. Essentially, we setup
certain entry points, and make them available to each compartment.
Currently, these involve the PCC/DDC of the switch_compartment function
(PCC to execute, DDC to access memory where we store information pertaining
to compartments; essentially, switch_compartment performs a limited
privelege escalation, but only over prepared data within main).
Additionally, we also seal the PCC of switch_compartment.

@0152la
Copy link
Contributor Author

0152la commented Dec 16, 2021

I'm more or less putting this here now to see if anyone has any ideas if we can break this model. I'll probably take a bit of time to tinker with that. I also have not yet implemented exact function bounds (as per #54), so that is one know vector of attack. But as is, I believe a compartment can now not access data regarding other compartments (which is available within switch_compartment).

* Currently, these involve the PCC/DDC of the `switch_compartment` function
* (PCC to execute, DDC to access memory where we store information pertaining
* to compartments; essentially, `switch_compartment` performs a limited
* privelege escalation, but only over prepared data within `main`).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sp "privilege"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 3955d6a.

* | |
* | HEAP |
* | |
* | saved caps |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is included in "saved caps"? I guess these are the "some capabilities of interest" mentioned earlier, but I must admit that I don't know precisely what that set might be!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is detailed in main.c:4:

Currently, these involve the PCC/DDC of the switch_compartment function (PCC to execute, DDC to access memory where we store information pertaining to compartments

Further, there is a saved_caps variable a bit down. I can add in an underscore to make the connection more obvious?

size_t id = 0;

// Duplicate certain capabilities within the heap of each compartments;
// currently we save DDC and PCC of `switch_compartment`
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If another compartment can arbitrarily read/write to the compartment switcher I think, in general, that it can get the compartment switcher to do whatever it wants. But, if switch_compartment is a sentry (which I think it is), we don't need its PCC to be stored outside? The DDC is an interesting question: I'm not sure how to handle that. Can a sentry also change the DDC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I don't quite follow. I think there are three issues at play here:

  1. The minimal amount of data a malicious compartment requires in order to be able to "take control" of switch_compartment (sentence 1).
  2. Requiring the PCC of switch_compartment to be "stored outside" (I don't understand where "outside" is). (sentence 2)
  3. Issue of DDC integrity (sentence 3-4)

For 1, we need to balance the minimal information switch_compartment needs in order to actually perform the switch. We obviously need something to be exposed in order for the function to be executable (at minimum, we can say the address of switch_compartment, but of course that is useless without knowing what compartment we want to switch to, or where that compartment is in memory). If the minimal information needed to perform the switch and the minimal data needed to take control of switch_compartment are not disjoint, then this is a design problem, and this design is infeasible.

For 2, I will clarify that the switch_compartment available to the compartments is a sentry, which can only be branched to. I need further clarification regarding the "outside" thing. For 3, what does "a sentry [..] change the DDC" mean? A sentry is a sealed capability with it's value being the address of a function. By that definition, a sentry is changing the DDC, as the DDC swapping happens inside switch_compartment (although the DDC is expected in a register, provided as an argument by the calling compartment).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although the DDC is expected in a register, provided as an argument by the calling compartment

I think that's a concern. What stops the caller from providing an inappropriate DDC? However, in this case I think it can only cause a crash in the switcher (by providing an inadequate DDC). That might be acceptable in the (currently implicit) threat model.

If it's necessary to bind a pair (to prevent this), it's possible to do that using ldpblr, as we've discussed before.

mov x12, sp
add x11, x10, #COMP_OFFSET_STK_ADDR
ldr x11, [x19, x11]
// add x13, x10, #COMP_OFFSET_STK_LEN
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general we prefer to delete commented code (unless there's a really good reason to keep it around).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. Removed in 3955d6a.

@0152la
Copy link
Contributor Author

0152la commented Dec 17, 2021

After further talks, I believe we conclude that leaking the DDC of the compartment switcher is unacceptable. Therefore this design is inappropriate in its current incarnation, unless we can somehow communicate compartment information to the switcher function without needing to expose this DDC capability.

@ltratt
Copy link
Collaborator

ltratt commented Dec 17, 2021

I think that's a fair summary of our current thoughts.

@0152la
Copy link
Contributor Author

0152la commented Dec 17, 2021

Would you say sealing the DDC (to be unsealed somehow in switch_compartment work?

Copy link
Contributor

@jacobbramley jacobbramley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'll need another pass for my own understanding. I'm not sure if I yet understand what guarantees each compartment (including the switcher) can reasonably make, for example.

@@ -0,0 +1,194 @@
/* In this example, we tighten security even further. For each compartment we
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which example does it follow on for? To what does "even further" refer? (I can guess, but it may not be obvious later.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified in 9fe7172.


/* Abstract representation of a compartment. Within 80 bytes we represent:
* - an id (8B)
* - address to the start of the compartment's stack (8B)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the "start" the lowest address, or the highest (initial) address?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified in db7617b.

void init_comps()
{
void *__capability switch_cap = (void *__capability) switch_compartment;
switch_cap = cheri_bounds_set(switch_cap, 80 * 4);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that a guess at the size of the switch_compartment function? You can make this automatic (and tight) with some assembly directives, though I'm not sure how to make it a constant. For now, perhaps comment or assign it to a named variable, so the intention is clear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is related to #54 . I have not implemented this yet, but at least lifting it for the future would be good.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better (not necessarily good) way of getting function sizes has been added in db7617b.

Comment on lines 125 to 126
asm("mov c19, %w0\n\t"
"mov x0, #0\n\t"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to make sure (basically) that the C compiler doesn't mess with those registers between setting them, and reading them in switch_compartment. The only safe way to do that is to make sure that there is no C code in between, which mean writing this whole function in assembly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the only thing following is the call to switch_compartment, would it be safe to simply include that function call as part of the inline assembly, maybe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can, but you'll have to mark all potentially-clobbered registers as such, and there are a lot of them. (All the caller-saved registers, basically.) It can be done if that's what you need to do, but it's often easier to put the whole lot into a .s (or .S) file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. So the naive solution then is to move executive_switch in our shared.S file, or is that still insufficient?

Comment on lines 125 to 127
asm("mov c19, %w0\n\t"
"mov x0, #0\n\t"
"msr CID_EL0, c0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More generally — and this requires some thought — does splitting the actual switching not represent a security concern? For example, what happens if I call switch_compartment directly? Nothing seems to prevent me from doing so, and then I can provide whatever value in c19 that I want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that is a concern. The problem with the current design is that in order to access the compartment information (which we want to keep secure from compartments) in memory, we need to set the DDC over that region of memory. I can't think of a way to self-determine this in switch_compartment, without some expected cooperation from the compartment we are transitioning from.

As for why you can't call switch_compartment directly (assuming from within a compartment), it is that the PCC for the compartment you are in does not cover switch_compartment, and the only way to transition is via the provided sentry.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the time of writing, I would have argued that holding the sentry is sufficient to make the call, since a malicious caller could provide a mis-matched DDC (or ID from which it is derived, or similar). However, after our recent call I now think you've resolved that using ldpblr anyway. Is that the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is correct - the compartment would have the PCC (albeit sealed), so it could call switch_compartment and give it a malformed value in c19 (now c29), which could lead to something unwanted. Now, the compartment only is aware of a sealed capability pointing to some address (which happens to contain the unsealed switcher DDC), but that capability can only be used via lpb-like instructions (including ldpblr), with set parameters (i.e., the DDC/PCC combo set in memory, outside of the compartment's DDC bounds).

hybrid/compartment_examples/inter_comp_call_secure/main.c Outdated Show resolved Hide resolved
/* This function stores the compartment data in memory, so that it can be
* accessed by the compartments as needed.
*/
void executive_switch(struct comp c)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't see that this actually uses executive mode. What have I missed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "executive" is not related to the notion of executive/restricted in Morello, just bad naming we could say. I will edit it to make sure there is no such confusion.


asm("mov c19, %w0\n\t"
"mov x0, #0\n\t"
"msr CID_EL0, c0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that passing this in a general-purpose register might be better. It can be set in switch_compartment, rather than set here and read there.

The real answer here rather depends on what the ID actually means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, honestly, c19 is more a remnant of previous iterations. It can be anything, or if we can change the design of switch_compartment to not require this argument be passed somehow...

If by ID you mean the contents of CID_EL0, it represents the ID of the compartment we are jumping into (which would, of course, require some knowledge between compartments of what compartment corresponds to what ID). This currently corresponds to the index of the comps array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant CID_EL0. One of the first things that switch_compartment does is to read the value out of CID_EL0 (into c10), to use it to index the comps array, but then why not just leave it in a general-purpose register (c0) and let switch_compartment choose if or how to put it in CID_EL0? That seems like a better abstraction to me, and a clearer interface because it looks more like a conventional function call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, there is no practical reason, beyond showing one way of using this "special-purpose-but-not-really" CID register. I can either add a comment to emphasise this, or we can use c0 or any other register, if we prefer to have a "cleaner" example.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to set it, but we should do so in switch_compartment. Here, we should just pass the value in c0 (or similar). I don't think using CID_EL0 as a parameter-passing register would be considered a good idea.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the use of CID_EL0 as discussed in 47e39b7.

@0152la
Copy link
Contributor Author

0152la commented Jan 7, 2022

I have pushed a new commit making use of ldpblr, as discussed, and obfuscating the switcher's required capabilities.

@ltratt
Copy link
Collaborator

ltratt commented Jan 7, 2022

This looks very promising! Can we create some "negative examples" (for want of a better term) in hybrid/compartment_examples that show what happens when code tries doing naughty things? For example, we could have a few examples which try to read the DDC from switch_compartment and check that they fail.

[At the moment we lack a nice testing infrastructure on CHERI for this stuff. @drpdr used shell scripts elsewhere in the repo to automatically check which programs should succeed and which should fail -- that's still the best approach I can think of.]

@0152la
Copy link
Contributor Author

0152la commented Jan 10, 2022

Another rather meaty update, fixing memory allocation within the compartment (again, for no current functional use), and a few other tidbits. I think the only thing left is to remove the usage of CID_EL0; I'll want to do so probably early tomorrow, along with a clang-format, to use the new version of the build in the CI, as per our earlier remarks (I have not updated the build as I was working on the PR). Otherwise, if I missed anything, it would be useful to resolve some of the conversations we have lying around, since it's really hard to properly look for them with all these outdated versions of the PR.

// When creating a compartment, store a local copy of the capability which
// will allow us to call `switch_compartment` in the heap of the compartment.
memcpy(new_comp.heap_addr, &switcher_call, sizeof(void *__capability));
void * heap_top = (void *) (_start_addr + total_comp_size - sizeof(void* __capability));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor formatting: s/void * heap_top/void *heap_top/ and s/void* __capability/void *__capability/.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should be fixed once I apply clang-format, which I have not quite yet. Should be shortly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied clang-format in 5d529ed.

@ltratt
Copy link
Collaborator

ltratt commented Jan 10, 2022

I haven't fully groked this yet, but my first impression is positive! Are you going to push a negative example or two to this PR? I think that might help me understand the details a bit better.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

I think that we should finalize this PR first, considering both its age and the amount of discussion around it, before adding negative examples in a follow-up PR. However, I have nothing against including those here, if we think that's reasonable.

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

Let's add 1 simple negative example to this PR, just as a proof of concept, then I agree: we can merge, and the next PR can add lots more negative examples.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

Added a negative example, where we try to dereference a local capability for switch_compartment access. The dereference predictably fails due to it being sealed.

I believe at this point, the only unresolved issue [1] is moving executive_switch from C to the assembly file, in order to ensure no clobbering of the registers we need set up. I'll have a go at that.

[1] #55 (comment)

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

Please squash as you see fit.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

I've also pushed the final change of moving executive_switch in assembly, to prevent potential register clobbering at the level of C. I'll squash after the meeting today, just in case there are any further things left to do.

Furthermore, I've tried running both this example, as well as some of the previous ones on our daily CHERI build (with @drpdr's help), but I encountered some issues. As of now, none of the examples I have added (after ddc_compartment_switching_sentry) have been, or will be, executed outside my local dev environment.

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

What issues? We should fix them before merging unless there's a very good reason not to.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

No issues with the infrastructure itself, but difficulties in trying to manually make use of it. The final step was to get the correct SSH key to connect to, which I'll bother Pino about at the some point. Just that, overall, trying to manually use our nightly builds could be more streamlined.

The one "issue" I would say it has is that it runs a very limited set of examples, which need to be manually provided [1]. This includes not ensuring that a newly provided example does indeed run on the build (but it does check that it compiles, as I was told). I would put this with overall infrastructure improvements we could be making to the repo, as we should also improve the overall organization and build system.

[1] https://github.com/capablevms/cheri-examples/blob/master/tests/run_tests.sh#L61

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

I agree that we can improve things: but we should hook this new example into buildbot.sh (directly or indirectly) in this PR. That way we will make sure that we pick up future enhancements to this code too :) If you push a commit to that effect, I think we can get this merged.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

I know how to add that code, but I have yet to be able to validate it locally. I can either try to get it working (at worst, I'll just update my dev environment), or we can just push it blindly, which I'm not keen on, but we can fix stuff once we see it's broken I guess.

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

The good thing about pushing it is that bors won't merge it if it doesn't work! So go ahead.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

Good point. I've added this, and inter_comp_call to the tests.

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

Excellent! Since you're not sure if they work or not I recommend you do a bors try first to see if it works.

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

bors try

bors bot added a commit that referenced this pull request Jan 11, 2022
@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

Our standard practise in this situation is to keep pushing fix commits and doing bors try until it works :) [You can see the failure by clicking on the "buildbot/capablevms-test-script" test script FWIW.]

@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

bors try

bors bot added a commit that referenced this pull request Jan 11, 2022
@bors
Copy link
Contributor

bors bot commented Jan 11, 2022

try

Build succeeded:

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

@0152la Time to squash?

@0152la 0152la force-pushed the inter_comp_call_secure branch from f41d67f to 4c499b9 Compare January 11, 2022 18:30
@0152la
Copy link
Contributor Author

0152la commented Jan 11, 2022

@ltratt I assumed, after our conversation, that I would integrate the failing example in this PR. I can do that when I add the rest of the examples in the CI, in a future PR then, or I can just update this PR tomorrow if we want the failing example in as well.

@ltratt
Copy link
Collaborator

ltratt commented Jan 11, 2022

Ah, sorry, I thought the failing example was in here! Let's link it in -- we might as well :)

@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

After trying lots of things, I decided, at the moment, to just have some duplication and move some stuff in their independent directories. I will do some tries now to see if I got everything right.

@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

bors try

bors bot added a commit that referenced this pull request Jan 12, 2022
@bors
Copy link
Contributor

bors bot commented Jan 12, 2022

try

Build failed:

@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

bors try

bors bot added a commit that referenced this pull request Jan 12, 2022
@bors
Copy link
Contributor

bors bot commented Jan 12, 2022

try

Build failed:

@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

bors try

bors bot added a commit that referenced this pull request Jan 12, 2022
@bors
Copy link
Contributor

bors bot commented Jan 12, 2022

try

Build succeeded:

@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

@ltratt @jacobbramley I think we're set. This final set of commits reorganized 3 examples: the main correct example of this PR, the negative example of this PR, and the one in #53. It also adds all these three examples to our CI.

@ltratt
Copy link
Collaborator

ltratt commented Jan 12, 2022

Unless @jacobbramley has any comments, please squash.

In this example, we tighten security even further. We store the DDC/PCC
of `switch_compartment` within its memory region. Then, for each
compartment, we create a copy of a capability pointing to this DDC/PCC
pair, sealed with `lpb`. This allows compartments to call
`switch_compartment` without having access to any information pertaining
to it.

We provide an additional example (`shared_try_deref.S`), where a
malicious compartment attempts to dereference its local copy, to gain
access to `switch_compartment`.
@0152la 0152la force-pushed the inter_comp_call_secure branch from d852147 to 7f1e1ea Compare January 12, 2022 15:54
@0152la
Copy link
Contributor Author

0152la commented Jan 12, 2022

Squashed; can of course address further comments, if there are any.

@ltratt
Copy link
Collaborator

ltratt commented Jan 12, 2022

I think it's best to get this in, and we can improve it in further PRs if necessary.

bors r+

@bors
Copy link
Contributor

bors bot commented Jan 12, 2022

Build succeeded:

@bors bors bot merged commit 5836ec3 into capablevms:master Jan 12, 2022
Copy link
Contributor

@jacobbramley jacobbramley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks roughly correct but I'll look more closely later. I think all the earlier threads are resolved.

Comment on lines +176 to +177
// 40 is arbitary; meant to be the size of the executable function within
// compartment
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you addressed that with precise pointer arithmetic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I forgot to update this comment. I'm unsure if we can amend this PR as is, after merging, or should this be done as part of another PR? @ltratt

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is merged, so it can be done in another PR. I'm fine with small PRs that fix such issues!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants