invisicap.txt (forked from pizlonator/llvm-project-deluge)
This document outlines a new capability model for Fil-C called InvisiCap. InvisiCap is a possible
replacement for the current Fil-C MonoCap model. The goal of InvisiCap is that it:
- Aims at super fast checks of all int (i.e. non-ptr) accesses. Because int accesses outnumber ptr
accesses, this should result in a big speed-up.
- Aims at ptr accesses that are no worse than in Fil-C right now (and are possibly better).
- Allows ptr-int unions where you can flip-flop between ptr and int.
-> Storing an int where there used to be a ptr changes where the ptr points, but retains the old
capability. So, int stores only result in a valid ptr if the address is in bounds of whatever
that pointer used to point to.
-> Storing a ptr where there used to be an int is fine. Future int reads will see the ptr's raw
value.
This removes one of the big reasons why some C code is currently incompatible with Fil-C and needs
changes to work.
- Retains atomicity of ptrs so long as they are marked volatile, _Atomic, or std::atomic (just like
MonoCaps) and there is no atomic int versus atomic ptr type confusion. If these conditions are
violated, then memory safety still holds, but you might get weird outcomes (either you don't get the
atomicity or ints and ptrs stored to the same location behave as if they were at different
locations). See bottom of the doc for an analysis of atomicity.
- Does not require CAS on stores or loads, except exactly one CAS the first time that a ptr is stored
into an object. This should be a big speed-up over MonoCaps, which require CAS on first access to
each word (and this shows up in profiles).
- Gives the illusion of pointers being 64-bit. This removes another compatibility hurdle (lots of code
just assumes that pointers are either 32-bit or 64-bit and nothing else).
-> There's nothing in the algorithm that requires 64-bit systems; this approach would work fine on
32-bit systems (you'd get the illusion that ptrs are 32-bit, as you'd expect on such a system).
- Does not sacrifice memory safety at all.
- Gives a path to ABI compatibility with Yolo-C.
- Gives a path to removing the need for a GC (if we're willing to have weaker guarantees for use-
after-free; those weaker guarantees are still technically memory safe).
InvisiCap introduces a different space trade-off than MonoCap. MonoCaps require 1 byte for every 16
bytes and a 32 byte object header. InvisiCaps require 8 bytes for every 8 bytes in any object that has
at least one pointer, but zero otherwise, plus a 16 byte header. So, InvisiCaps are more efficient for
integer-only objects, more efficient for pointer-only objects, equally efficient for small objects
that mix integers and pointers, and less efficient for large objects that mix integers and pointers.
The capability object precedes the allocation, always. It's:
struct filc_object {
    size_t size;
    uintptr_t aux; /* 16 bits hold flags, 48 bits hold the aux ptr. */
};
Newly allocated memory always gets a filc_object header, the aux starts out zero, the size is the
object's size (rounded up to 8 bytes), and the payload is always initialized to zero.
Pointers are a tuple of the raw pointer and the lower bound, i.e. the end of the capability object.
So, filc_ptr_lower(ptr) points at the tail end of the filc_object, and that's what the capability
pointer carries. So, to get the filc_ptr_object(ptr), we do ((filc_object*)filc_ptr_lower(ptr)) - 1.
And to support atomics, we have:
struct PAS_ALIGNED(16) filc_atomic_box {
    void* lower;
    void* ptr;
};
For normal objects, the object->aux & FILC_AUX_PTR_MASK pointer points at an array of either lowers or
atomic boxes. The aux entries are tagged with FILC_ATOMIC_BOX_BIT (i.e. (uintptr_t)1) to indicate that
the lower points at a filc_atomic_box.
The aux array may not exist (for int-only objects). If it exists, it has exactly the same number of
bytes as the main object. Integers in the object have no corresponding entry in aux (it will be zero).
Pointers in the object will have the raw pointer in the main object and the capability pointer (in the
form of a pointer to the lower bound, i.e. a pointer to the "top" of the filc_object) in the
corresponding entry in aux. Finally, atomic pointers (pointers that were accessed using atomics) will
have a filc_atomic_box pointer in the corresponding entry in aux, and that box will have the
authoritative value of that atomic pointer. But the pointer's value will be non-atomically replicated
into the main object payload, so nonatomic integer loads get *some* value that is reasonably recent.
Aux is created lazily, on first pointer store. Atomic boxes are created lazily, on first atomic
pointer store.
In the case of specials, size is zero, preventing all access. The flags tell what kind of special
object we have.
In the case of function pointers, aux & FILC_AUX_PTR_MASK points to the actual function. In the case
of all other specials, the aux points to wherever we say it's legal for the pointer to point in order
to actually use it.
Pointers at rest always have lower pointing at the tail of a filc_object, and never at an atomic box.
It's still the case that pointers-at-rest have the filc_ptr type, which carries both the lower and the
raw_ptr. But pointers stored in the heap are usually split - the raw_ptr is in the main object payload
while the lower is in the aux.
We rely on the following helpers:
void* filc_ptr_lower(filc_ptr); /* Returns the lower portion of the ptr. This is zero actual work
                                   since the lower will be carried in a register. */
filc_object* filc_ptr_object(filc_ptr); /* Returns the filc_object (i.e. lower minus object
                                           size). Also zero actual work, since the compiler will
                                           just generate object accesses as negative offsets to
                                           lower. */
void* filc_ptr_ptr(filc_ptr); /* The raw pointer. Zero actual work since the raw ptr will be
                                 carried in a register. */
filc_ptr filc_ptr_create(void* lower, void* raw_ptr); /* Wrap up a lower/raw_ptr into a filc_ptr.
                                                         Zero actual work. */
void filc_object_make_aux(filc_object*); /* Allocate an aux for the object. Fast inlined object
                                            allocation plus a CAS to set the low 48 bits of aux
                                            without disturbing the flags. The payload of the aux
                                            is always initialized to zero on allocation. */
char* filc_object_aux_ptr(filc_object*); /* Return the pointer portion of the aux (the low 48
                                            bits). I.e. object->aux & FILC_AUX_PTR_MASK. Returned
                                            as a char* because all uses involve arithmetic in
                                            terms of bytes. */
void filc_store_barrier(filc_object*); /* Execute a GC store barrier when the value being stored
                                          points at the given object. */
filc_atomic_box* filc_atomic_box_create(filc_ptr value); /* Allocate an atomic box and initialize
                                                            it to the given ptr. */
filc_ptr filc_atomic_box_load_atomic(filc_atomic_box*); /* 128-bit atomic load. */
void filc_atomic_box_store_atomic(filc_atomic_box*, filc_ptr); /* 128-bit atomic store. */
bool filc_atomic_box_cas_atomic(filc_atomic_box*, filc_ptr expected,
                                filc_ptr new_value); /* 128-bit atomic CAS. */
An integer read (atomic or not) looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    return *(uintptr_t*)filc_ptr_ptr(ptr);
An integer write (atomic or not) looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_object(ptr)->flags & FILC_OBJECT_FLAG_READONLY)
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    *(uintptr_t*)filc_ptr_ptr(ptr) = new_value;
An atomic integer CAS looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    return CAS((uintptr_t*)filc_ptr_ptr(ptr), expected_value, new_value);
A pointer read (atomic or not) looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    if (!filc_object_aux_ptr(filc_ptr_object(ptr)))
        return filc_ptr_create(NULL, *(void**)filc_ptr_ptr(ptr));
    uintptr_t lower_or_box = *(uintptr_t*)(
        filc_object_aux_ptr(filc_ptr_object(ptr)) + (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr)));
    if ((lower_or_box & FILC_ATOMIC_BOX_BIT)) {
        return filc_atomic_box_load_atomic(
            (filc_atomic_box*)(lower_or_box & ~FILC_ATOMIC_BOX_BIT));
    }
    return filc_ptr_create((void*)lower_or_box, *(void**)filc_ptr_ptr(ptr));
A nonatomic pointer write looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_object(ptr)->flags & FILC_OBJECT_FLAG_READONLY)
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    if (!filc_object_aux_ptr(filc_ptr_object(ptr)))
        filc_object_make_aux(filc_ptr_object(ptr));
    filc_store_barrier(filc_ptr_object(new_value));
    *(void**)(filc_object_aux_ptr(filc_ptr_object(ptr)) + (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr))) =
        filc_ptr_lower(new_value);
    *(void**)filc_ptr_ptr(ptr) = filc_ptr_ptr(new_value);
An atomic pointer write looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_object(ptr)->flags & FILC_OBJECT_FLAG_READONLY)
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    if (!filc_object_aux_ptr(filc_ptr_object(ptr)))
        filc_object_make_aux(filc_ptr_object(ptr));
    filc_store_barrier(filc_ptr_object(new_value));
    uintptr_t* lower_or_box_ptr = (uintptr_t*)(
        filc_object_aux_ptr(filc_ptr_object(ptr)) + (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr)));
    uintptr_t lower_or_box = *lower_or_box_ptr;
    if ((lower_or_box & FILC_ATOMIC_BOX_BIT)) {
        filc_atomic_box_store_atomic(
            (filc_atomic_box*)(lower_or_box & ~FILC_ATOMIC_BOX_BIT), new_value);
    } else {
        *lower_or_box_ptr = (uintptr_t)filc_atomic_box_create(new_value) | FILC_ATOMIC_BOX_BIT;
    }
    *(void**)filc_ptr_ptr(ptr) = filc_ptr_ptr(new_value); /* nonatomic replication into the payload */
An atomic pointer CAS looks like:
    if (misaligned)
        fail;
    if (!filc_ptr_lower(ptr))
        fail;
    if (filc_ptr_object(ptr)->flags & FILC_OBJECT_FLAG_READONLY)
        fail;
    if (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr) >= filc_ptr_object(ptr)->size)
        fail;
    if (!filc_object_aux_ptr(filc_ptr_object(ptr)))
        filc_object_make_aux(filc_ptr_object(ptr));
    filc_store_barrier(filc_ptr_object(new_value));
    uintptr_t* lower_or_box_ptr = (uintptr_t*)(
        filc_object_aux_ptr(filc_ptr_object(ptr)) + (filc_ptr_ptr(ptr) - filc_ptr_lower(ptr)));
    for (;;) {
        uintptr_t lower_or_box = *lower_or_box_ptr;
        if ((lower_or_box & FILC_ATOMIC_BOX_BIT)) {
            return filc_atomic_box_cas_atomic(
                (filc_atomic_box*)(lower_or_box & ~FILC_ATOMIC_BOX_BIT), expected_value, new_value);
        }
        if ((void*)lower_or_box == filc_ptr_lower(expected_value) &&
            *(void**)filc_ptr_ptr(ptr) == filc_ptr_ptr(expected_value)) {
            if (!CAS(lower_or_box_ptr, lower_or_box,
                     (uintptr_t)filc_atomic_box_create(new_value) | FILC_ATOMIC_BOX_BIT))
                continue;
            return true;
        }
        return false;
    }
Note that the CAS algorithms are guaranteed to linearize (be atomic and have semantics that make
sense) if certain conditions are met:
- There's no int-ptr type confusion.
- Ptr fields are marked volatile, _Atomic, or std::atomic.
Pointer accesses are guaranteed to linearize if the last store to the location was either an atomic
pointer store or an atomic pointer CAS. Pointer accesses are also guaranteed to linearize in the
following corner case: the last store was either an integer store or a nonatomic pointer store, but
it happened without any racing. The initialization of an object upon allocation satisfies that
requirement. The moment that a nonatomic pointer store race happens, then it's possible for the race
outcome to be that the pointer has the capability from one store and the raw ptr value from the other
store. It's also possible that if those stores race with a CAS, then the CAS might appear to fail or
succeed even though it shouldn't have.
A more succinct way to think about it is: a memory location is either in atomic box mode, or not. It
goes into atomic box mode once there is an atomic ptr store or ptr CAS. Atomic box mode is cancelled
as soon as there is a nonatomic ptr store.
Integer accesses are atomic no matter what. But: if an integer access aliases a pointer access, then
the outcome of that aliasing is different depending on whether the location is in atomic box mode. If
the location is not in atomic box mode, then the integer accesses alias the raw ptr portion of
whatever pointer is there. For example, you could do:
    *(void**)X = some_pointer;
    *(intptr_t*)X = 666;
    borked_ptr = *(void**)X;
And then borked_ptr will have the value 666, but the capability from some_pointer. But if the memory
location is in atomic box mode, then the outcome will be different: borked_ptr will be exactly
some_pointer, and subsequent integer reads from X will see 666.
To summarize, atomics really are atomic - for both integers and pointers - so long as there's no
int-ptr type confusion and atomically accessed ptr fields are marked volatile, _Atomic, or
std::atomic. If there are atomic loads to ptrs not marked atomic, then those accesses will still be
atomic provided that the location is in atomic box mode (the last store was an atomic ptr store or ptr
CAS) or the last store didn't race. If there is int-ptr type confusion, then the memory location
carries totally separate values for the integer accesses and the atomic ptr accesses, as opposed to
aliasing the raw ptr to the integer value of that location.