-
Notifications
You must be signed in to change notification settings - Fork 0
/
Exercise2_4.txt
221 lines (136 loc) · 7.83 KB
/
Exercise2_4.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
Adversarial CHERI Exercises and Missions
========================================
EXERCISE 2.4: EXERCISE AN INTER-STACK-OBJECT BUFFER OVERFLOW
https://ctsrd-cheri.github.io/cheri-exercises/exercises/buffer-overflow-stack/index.html
================================
Q: This exercise demonstrates an inter-object buffer overflow on baseline and CHERI-enabled architectures, and asks you to characterize and fix the bug detected by CHERI bounds enforcement. It also asks you to use GDB for debugging purposes.
By contrast to the globals-based example [exercise 2.5], this example uses two stack objects to demonstrate the overflow. We will be able to see the CHERI C compiler generate code to apply spatial bounds on the capability used for the buffer pointer we pass around.
1. Compile buffer-overflow-stack.c for the baseline architecture to the binary buffer-overflow-stack-baseline
-----
A: Doesn't work on my local machine at all:
clang -o buffer-overflow-stack-baseline buffer-overflow-stack-baseline.c
buffer-overflow-stack-baseline.c:35:13: error: use of undeclared identifier 'ptraddr_t'
assert((ptraddr_t)upper == (ptraddr_t)&lower[sizeof(lower)]);
^
1 error generated.
---
A: Try on bencher14:
lang -o buffer-overflow-stack-baseline buffer-overflow-stack-baseline.c
buffer-overflow-stack-baseline.c:26:13: error: use of undeclared identifier 'ptraddr_t'
assert((ptraddr_t)upper == (ptraddr_t)&lower[sizeof(lower)]);
^
buffer-overflow-stack-baseline.c:26:13: error: use of undeclared identifier 'ptraddr_t'
2 errors generated.
--
A: Replace ptraddr_t with size_t for the baseline version, now it compiles:
chericode % gcc -o buffer-overflow-stack-baseline buffer-overflow-stack-baseline.c
helenoliver@Helens-MacBook-Air-3 chericode % ./buffer-overflow-stack-baseline
upper = 0x7ffee2338690, lower = 0x7ffee2338680, diff = 10
upper[0] = a
upper[0] = b
So as expected, it gets away with it.
==============
Q cont'd: and for the CHERI-aware architecture to buffer-overflow-stack-cheri.
---
A: Compile with make -f Makefile.morello-purecap buffer-overflow-stack-cheri
---
A: run on QEMU morello:
./buffer-overflow-stack-cheri
upper = 0xfffffff7ff6c, lower = 0xfffffff7ff5c, diff = 10
upper[0] = a
In-address space security exception (core dumped)
========
Q3. Using GDB on the core dump (or run the CHERI program under gdb): Why has the CHERI program failed?
----
A:
Starting program: /root/buffer-overflow-stack-cheri
upper = 0xfffffff7ff6c, lower = 0xfffffff7ff5c, diff = 10
upper[0] = a
Program received signal SIGPROT, CHERI protection violation.
Capability bounds fault.
0x0000000000110b90 in write_buf ()
---
A: What the source materials looked up was the address of upper. (gdb) info registers shows ours is: upper = 0xfffffff7ff6c
And that matches:
c0 0xdc5d40007f6cff5c0000fffffff7ff5c 0xfffffff7ff5c [rwRW,0xfffffff7ff5c-0xfffffff7ff6c]
c2 0xdc5d40007f6cff5c0000fffffff7ff5c 0xfffffff7ff5c [rwRW,0xfffffff7ff5c-0xfffffff7ff6c]
So upper starts at the end of c0 AND the end of c2, which weren't overlapping in exercise 2.2. Here, their bounds are the same.
----
A:
up
#1 0x0000000000110c98 in main ()
----
A:
(gdb) disass
Dump of assembler code for function main [cut for readability].
The screwup occurred right here, officer:
=> 0x0000000000110c98 <+248>: ldr c1, [csp, #64]
What led up to that?
// load csp + 80 into c0
0x0000000000110c88 <+232>: ldr c0, [csp, #80]
// move 16 into (32-bit) w8
0x0000000000110c8c <+236>: mov w8, #0x10 // #16
// move w8 (16) into w1 (so both are 32-bit and are now 16)
// (gdb) p w8
// No symbol "w8" in current context.
// (gdb) info reg w8
// w8 0x62 98
// (gdb) info reg w1
// w1 0xfffffff7feb8 281474976186040
0x0000000000110c90 <+240>: mov w1, w8
// so... we have w1 (32-bit c1)=16 and sizeof(w1)=16 and that gets branched to
// write_buf?
0x0000000000110c94 <+244>: bl 0x110b60 <write_buf>
// load csp + 64 into c1 (16 to the left of c0)
=> 0x0000000000110c98 <+248>: ldr c1, [csp, #64]
// load csp + 48 into c0 (16 to the left of c1... if it weren't for those pesky kids)
0x0000000000110c9c <+252>: ldr c0, [csp, #48]
What's in csp at that point where it crashes?
Take csp + 64 and put it in c1.
Previously we subtracted 208 from csp.
Now we are at position -144 from that starting point, after we add 64 to it.
The answers say:
"The compiler has arranged for main to allocate 144 bytes on the stack by decrementing the capability stack pointer register (csp) by 144 bytes."
So we're in the same place, at the indicated line, as they are in the answer set.
"Further, the compiler has placed lower 48 bytes up into that allocation:"
// Looks like our compiler has placed c0 80 bytes up into that allocation:
0x0000000000110c88 <+232>: ldr c0, [csp, #80]
// then it jumps to the routine (passing it 16, 16)
// and then it tries to place c1 64 bytes into the allocation, 16 bytes below c0
=> 0x0000000000110c98 <+248>: ldr c1, [csp, #64]
"ca0 is made to point at its lowest address and then the pointer to lower is materialized in cs0 by bounding the capability in ca0 to be 16 (sizeof(lower)) bytes long."
On our side, previously we had this line:
0x0000000000110c70 <+208>: scbnds c0, c0, #0x10
so the capability in c0 is bound to be 16 also.
And at a certain earlier point c1 was also bound to be 16 in a similar way. (Unless it's indirectly changed since)
So on their side, they've got c0 and c1 running into each other, and we have as well.
(We ended up with c0 and c2 having the same bounds addresses but that doesn't seem to be a cause here)
---
On baseline (my local machine) it won't compile.
So use bencher 14 as baseline.
It's clear what happened above, it had a capability in w1 which it passed to the function and CHERI was having none of it.
The full objdump (gdb didn't have any debug info, and I'm not going to re-try) is in objdump_from_buffer-overflow-stack-baseline.txt and the relevant lines are here:
// load effective address -32 of the local variable (buf?) to the return value
11e2: 48 8d 45 e0 lea -0x20(%rbp),%rax
// write 16 to low 32-bits of the 2nd argument
11e6: be 10 00 00 00 mov $0x10,%esi
// write the return value to the first argument
11eb: 48 89 c7 mov %rax,%rdi
// here's where the crime occurs
11ee: e8 52 ff ff ff callq 1145 <write_buf>
// move the 1-byte local variable value to the 4-byte low 32-bits of the return value
11f3: 0f b6 45 f0 movzbl -0x10(%rbp),%eax
// copy 1-byte low byte of return value,
// and sign-extend into 4-byte low 32-bits of the return value
11f7: 0f be c0 movsbl %al,%eax
I have a bit of a hard time following this but it's clear that it just swanned ahead with it whereas CHERI put a stop to it on the equivalent of this line:
// move the 1-byte local variable value to the 4-byte low 32-bits of the return value
11f3: 0f b6 45 f0 movzbl -0x10(%rbp),%eax
The point here being, according to the course materials, that in the vanilla version we have the ADDRESS of buf on the stack, and in the CHERI version we have a CAPABILITY
and it's in
0x0000000000110c70 <+208>: scbnds c0, c0, #0x10
that the CHERI version enforces the size of c0.
The course materials also emphasize that in both versions, write_buf stores without bounds checking. In the CHERI version output by gdb, I haven't got the write_buf section because it only output main and I'm not going to re-dump it. But in the baseline one, I have:
1159: 48 01 d0 add %rdx,%rax
115c: c6 00 62 movb $0x62,(%rax)
So yeah anyway I get it, more or less.