-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.html
205 lines (170 loc) · 12.3 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
<!DOCTYPE html>
<html>
<head>
<title>Efficient Computing at Carnegie Mellon</title>
<base target="_blank">
</head>
<body>
<h1>Efficient Computing at Carnegie Mellon</h1>
<p>A list of recent open-source research systems developed by the research group of
<a href="https://www.cs.cmu.edu/~dga">Professor David G. Andersen</a> and
<a href="https://www.cs.cmu.edu/~kaminsky">Dr. Michael Kaminsky</a>
in the <a href="https://www.cmu.edu">Carnegie Mellon University</a>
<a href="https://csd.cs.cmu.edu">Computer Science Department</a>.
<h2>Data structures</h2>
<h3><a name="feedforward">Feed-Forward Bloom Filter</a></h3>
<p>Memory efficient and cache-optimized algorithm for simultaneously searching for a large number of
patterns in a very large corpus
[<a href="https://epubs.siam.org/doi/abs/10.1137/1.9781611972917.1">SIAM ALENEX 2011</a>]</p>
<p><a href="https://github.com/efficient/ffbf">Reference implementation</a></p>
<h3><a name="rankselect">Rank & Select Structures on Uncompressed Bit Sequences</a></h3>
<p>Rank & select structures that use a cache-centric design approach to achieve performance
competitive with the highest-performance prior designs while imposing space overhead as low as the
most space-efficient, but slower, prior designs
[<a href="http://www.cs.cmu.edu/~dongz/papers/poppy.pdf">SEA 2013</a>]</p>
<p><a href="https://github.com/efficient/rankselect">Reference implementation</a></p>
<h3><a name="cuckoofilter">Cuckoo Filter</a></h3>
<p>Data structure that can replace Bloom filters for approximate set membership tests, supporting
dynamic item addition and removal items while achieving even higher performance than Bloom filters
[<a href="https://dl.acm.org/doi/10.1145/2674005.2674994">CoNEXT 2014</a>]</p>
<p><a href="https://github.com/efficient/cuckoofilter">Reference implementation</a></p>
<h3><a name="libcuckoo">libcuckoo</a></h3>
<p>High-throughput and memory-efficient concurrent hash table that supports multiple readers and
writers [<a href="https://dl.acm.org/doi/10.1145/2592798.2592820">EuroSys 2014</a>]</p>
<p><a href="https://github.com/efficient/libcuckoo-sourcery">Version presented in paper</a></p>
<p><a href="https://github.com/efficient/libcuckoo">Updated implementation</a></p>
<h3><a name="surf">SuRF (Succinct Range Filter)</a></h3>
<p>Fast and compact data structure for approximate membership tests that supports both single-key
lookups and common range queries
[<a href="https://dl.acm.org/doi/abs/10.1145/3183713.3196931">SIGMOD 2018</a>]</p>
<p><a href="https://github.com/efficient/SuRF">SuRF data structure</a></p>
<p><a href="https://www.rangefilter.io">Live interactive demo</a>
(<a href="https://github.com/efficient/SuRF-demo">source code</a>)</p>
<p><a href="https://github.com/efficient/fast-succinct-trie">Underlying FST data structure</a></p>
<p><a href="https://github.com/efficient/rocksdb">RocksDB equipped with SuRF</a></p>
<h3><a name="hope">HOPE (High-speed Order-Preserving Encoder)</a></h3>
<p>Fast dictionary-based compressor that encodes arbitrary keys while preserving their order
[SIGMOD 2020]</p>
<p><a href="https://github.com/efficient/HOPE">Reference implementation</a></p>
<h2>Distributed consensus</h2>
<h3><a name="epaxos">EPaxos (Egalitarian Paxos)</a></h3>
<p>Distributed consensus algorithm based on Paxos that efficiently achieves optimal commit
latency in the wide-area when tolerating one and two failures, uniform load balancing across all
replicas, and graceful performance degradation when replicas are slow or crash
[<a href="https://dl.acm.org/doi/10.1145/2517349.2517350">SOSP 2013</a>]</p>
<p><a href="https://github.com/efficient/epaxos">Reference implementation</a></p>
<h3><a name="quorumlease">Paxos Quorum Leases</a></h3>
<p>Technique that allows Paxos-based systems to perform reads with high throughput and low latency
without sacrificing consistency or write latency or accepting more than minimal impact on system
availability
[<a href="https://www.pdl.cmu.edu/PDL-FTP/associated/CMU-PDL-14-105_abs.shtml">CMU PDL 2014</a>]</p>
<p><a href="https://github.com/efficient/qlease">Reference implementation</a></p>
<h2>Isolation</h2>
<h3><a name="catbench">CATBench</a></h3>
<p>Benchmark demonstrating that cache partitioning can protect the performance of latency-sensitive
networked applications from local resource contention
[<a href="http://reports-archive.adm.cs.cmu.edu/anon/2017/abstracts/17-125.html">CMU CSD 2017</a>]</p>
<p><a href="https://github.com/efficient/catbench">Benchmarks in paper</a></p>
<h3><a name="microservices">Microservice Microbenchmarks</a></h3>
<p>Benchmark demonstrating a likely explanation for contemporary serverless platforms' high
microservice invocation latencies in the hundreds of milliseconds for cold starts and tens of
milliseconds for warm starts
[<a href="https://www.usenix.org/conference/atc18/presentation/boucher">ATC 2018</a>]</p>
<p><a href="https://github.com/efficient/microservices_microbenchmarks">Benchmarks in paper</a></p>
<h3><a name="lpf">Lightweight Preemptible Functions</a></h3>
<p>Program primitives for limiting function execution time and isolating memory within a process</p>
<h4>Thesis</h4>
<a href="http://reports-archive.adm.cs.cmu.edu/anon/2022/abstracts/22-101.html">CMU CSD 2022</a>
<ul>
<li>Chapter 3: <a href="https://github.com/efficient/libinger/tree/gotcha">libgotcha</a></li>
<ul>
<li>Section 3.9.1: <tt>make bench</tt> (see <tt>bench.rs</tt>)</li>
<li>Section 3.9.2: <a href="https://github.com/efficient/dyno">dyno</a></li>
<li>Section 3.9.3: <a href="https://github.com/efficient/libas-safe/tree/gotchyas/tlsblock">libtlsblock</a> and <a href="https://github.com/efficient/libas-safe/tree/gotchyas/tls">benchmark</a></li>
</ul>
<li>Chapter 4: <a href="https://github.com/efficient/libas-safe">libas-safe</a> and <a href="https://github.com/efficient/libac-safe">libac-safe</a></li>
<li>Chapter 5: <a href="https://github.com/efficient/libinger">libinger</a></li>
<ul>
<li>Section 5.12: <tt>cargo build --release --bench inger</tt> (see <tt>benches/inger.rs</tt>)</li>
<li>Section 5.12.1: <a href="https://github.com/efficient/tokio/tree/libpng-bombs">benchmark</a></li>
<li>Section 5.6.1: <a href="https://github.com/efficient/microservices_microbenchmarks/tree/master/hasher">benchmark</a></li>
</ul>
<li>Chapter 6: <a href="https://github.com/efficient/libinger/tree/master/compiler">ingerc</a> and <a href="https://github.com/efficient/libac-safe/tree/rust">minimal</a></li>
<li>Chapter 7: <a href="https://github.com/efficient/tokio/tree/tokio-threadpool-preemptive">libturquoise</a> and <a href="https://github.com/efficient/tokio/tree/master">benchmark</a></li>
<li>Chapter 8: <a href="https://github.com/efficient/strobelight-rpc">strobelight</a></li>
<li>Chapter 9: <a href="https://github.com/efficient/microservices_microbenchmarks">benchmark</a></li>
</ul>
<h4>Conference paper</h4>
<p>Mechanism for synchronously performing a function call with a precise timeout that is
lightweight, efficient, and composable, all while being portable between programming languages
[<a href="https://www.usenix.org/conference/atc20/presentation/boucher">ATC 2020</a>]</p>
<p><a href="https://github.com/efficient/libinger">libinger implementation</a></p>
<p><a href="https://github.com/efficient/tokio">libturquoise (preemptive tokio-threadpool)</a></p>
<p><a href="https://github.com/efficient/libinger/tree/gotcha">libgotcha implementation</a></p>
<p><a href="https://github.com/efficient/libas-safe">libas-safe library</a></p>
<p><a href="https://github.com/efficient/libas-safe/tree/gotchyas/posix_safety">libas-safe example</a></p>
<p><a href="https://github.com/efficient/tokio/tree/master">hyper benchmark from paper</a></p>
<p><a href="https://github.com/efficient/tokio/tree/libpng-bombs">libpng benchmark from paper</a></p>
<h2>Key-value stores</h2>
<h3><a name="memc3">MemC3</a></h3>
<p>Set of architecturally and workload inspired algorithmic and engineering improvements to the
popular Memcached system that substantially improve both its memory efficiency and throughput
[<a href="https://www.usenix.org/conference/nsdi13/technical-sessions/presentation/fan">NSDI 2013</a>]</p>
<p><a href="https://github.com/efficient/memc3">MemC3 implementation</a></p>
<h3><a name="herd">HERD</a></h3>
<p>Key-value system designed to make the best use of an RDMA network by reducing network round trips
while using efficient RDMA primitives
[<a href="https://dl.acm.org/doi/10.1145/2619239.2626299">SIGCOMM 14</a>]</p>
<p><a href="https://github.com/efficient/HERD">HERD implementation</a></p>
<h3><a name="mica">MICA</a></h3>
<p>Scalable in-memory key-value store that achieves a record-setting throughput of 120 million
requests per second (MRPS) on a single commodity server, 9.2X the performance (RPS) and 2.8X the
system energy efficiency (RPS/watt) of the best-published FPGA-based claims
[<a href="https://www.usenix.org/conference/nsdi14/technical-sessions/presentation/lim">NSDI 2014</a>,
<a href="https://ieeexplore.ieee.org/document/7284088">ISCA 2015</a>]</p>
<p><a href="https://github.com/efficient/mica">Version presented in NSDI paper</a></p>
<p><a href="https://github.com/efficient/mica/tree/isca2015">Version presented in ISCA paper</a></p>
<p><a href="https://github.com/efficient/mica2">Updated implementation (MICA 2)</a></p>
<h3><a name="cicada">Cicada</a></h3>
<p>Single-node multi-core in-memory transactional database with serializability that provides
high performance under diverse workloads by leveraging optimistic and multi-version concurrency
control schemes and multiple loosely synchronized clocks
[<a href="https://dl.acm.org/doi/10.1145/3035918.3064015">SIGMOD 2017</a>]</p>
<p><a href="https://github.com/efficient/cicada-engine">Cicada implementation</a></p>
<p><a href="https://github.com/efficient/cicada-exp-sigmod2017">Benchmarks in paper</a></p>
<h2>Memory and storage</h2>
<h3><a name="memstores">Building Blocks for Main Memory Data Stores</a></h3>
<p>Systems design proposals for consistent, durable, and safe memory management for future
byte-addressable non-volatile (NV) memory
[<a href="https://www.pdl.cmu.edu/PDL-FTP/NVM/CMU-PDL-11-114_abs.shtml">CMU PDL 2011</a>]</p>
<p><a href="https://github.com/efficient/nvram">Reference implementations</a></p>
<h3><a name="logstructured">Evaluation of Multi-Stage Log-Structured Designs</a></h3>
<p>New analytic primitives and MSLS design models that quickly give accurate performance estimates
for evaluation of systems such as LevelDB, RocksDB, HBase, and Cassandra
[<a href="https://www.usenix.org/conference/fast16/technical-sessions/presentation/lim">FAST 2016</a>]</p>
<p><a href="https://github.com/efficient/msls-eval">Benchmarks in paper</a></p>
<h2>Networking</h2>
<h3><a name="cuckooswitch">CuckooSwitch</a></h3>
<p>Software-based Ethernet switch design built around a memory-efficient, high-performance, and
highly-concurrent hash table for compact and fast FIB lookup
[<a href="https://dl.acm.org/doi/10.1145/2535372.2535379">CoNEXT 2013</a>]</p>
<p><a href="https://github.com/efficient/cuckooswitch">CuckooSwitch implementation</a></p>
<h3><a name="gopt">G-Opt</a></h3>
<p>Exploration of a hypothesis that many of the benefits of using Graphics Processing Units (GPUs)
as accelerators for software-based routing and packet handling applications arise less from the GPU
hardware itself as from the expression of the problem in a language such as CUDA or OpenCL that
facilitates memory latency hiding and vectorization through massive concurrency
[<a href="https://www.usenix.org/conference/nsdi15/technical-sessions/presentation/kalia">NSDI 2015</a>]</p>
<p><a href="https://github.com/efficient/gopt">Benchmarks in paper</a></p>
<h3><a name="rdmabench">Design Guidelines for High Performance RDMA Systems</a></h3>
<p>Framework for understanding RDMA performance and helping system designers to navigate the RDMA
design space
[<a href="https://www.usenix.org/conference/atc16/technical-sessions/presentation/kalia">ATC 2016</a>]</p>
<p><a href="https://github.com/efficient/rdma_bench">Benchmarks in paper</a></p>
<h3><a name="fasst">FaSST (Fast, Scalable and Simple distributed Transactions)</a></h3>
<p>RDMA-based system that provides distributed in-memory transactions with serializability and
durability, yet eschews one-sided RDMA for fast RPCs using two-sided unreliable datagrams
[<a href="https://www.usenix.org/conference/osdi16/technical-sessions/presentation/kalia">OSDI 2016</a>]</p>
<p><a href="https://github.com/efficient/fasst">Benchmarks in paper</a></p>
</body>
</html>