-
Notifications
You must be signed in to change notification settings - Fork 0
/
a4.html
354 lines (316 loc) · 13.2 KB
/
a4.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xml:lang="en" lang="en">
<head>
<title>CS-240 Assignment 4: Key-Value Storage Service</title>
<link rel="stylesheet" type="text/css" href="style.css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
</script>
</head>
<body>
<h1>CS-240 Assignment 4: Key-Value Storage Service</h1>
<div id="topnavbar">
<ul class="topnavlist">
<li><a href="index.html">CS-240 Home</a></li>
<li><a href="syllabus.html">Syllabus</a></li>
<li><a href="assignments.html">Assignments</a></li>
<li><a href="announcements.html">Announcements</a></li>
<li><a href="https://piazza.com/kaust.edu.sa/fall2017/cs240/home">Q & A</a></li>
</ul>
</div>
<h2>Introduction</h2>
<p>
In this assignment you will build a fault-tolerant key-value storage
service using your Raft library from the previous assignments.
Your key-value service will be structured as a replicated state machine
with several key-value servers that coordinate their activities
through the Raft log. Your key-value service should continue to
process client requests as long as a majority of the servers
are alive and can communicate, in spite of other failures or
network partitions.
</p>
<p>
Your system will consist of clients and key-value servers,
where each key-value server also acts as a Raft node. Clients
send <tt>Put()</tt>, <tt>Append()</tt>, and <tt>Get()</tt> RPCs
to key-value servers (called kvraft servers), who then place
those calls into the Raft log and execute them in order. A
client can send an RPC to any of the kvraft servers, but if that
server is not currently a Raft leader, or if there's a failure, the
client should retry by sending to a different server. If the
operation is committed to the Raft log (and hence applied to
the key-value state machine), its result is reported to the
client. If the operation failed to commit (for example, if the
leader was replaced), the server reports an error, and the
client retries with a different server.
</p>
<h2>Software</h2>
<p>
You can download a tarball containing the provided code for assignment 4 <a href="assignments/assignment4.tar.gz">here</a>.
For this assignment, we have supplied you with skeleton code and
tests at <tt>src/kvraft</tt>. You will need to modify
<tt>kvraft/client.go</tt>, <tt>kvraft/server.go</tt>, and perhaps <tt>kvraft/common.go</tt>.
(Even if you don't modify <tt>common.go</tt>, you should submit it as-provided.) You will also need to replace the current <tt>raft.go</tt> with your own <tt>raft.go</tt> from assignment 3.
</p>
<p>
To get up and running, execute the following commands, as in the previous assignments, and change into the <tt>src/kvraft</tt> directory:
<pre>
$ cd cs240 # or wherever you unpacked your tarball
$ export GOPATH="$PWD"
$ cd "$GOPATH/src/kvraft"
$</pre>
</p>
<h2>Part I</h2>
<p>
The service supports three RPCs: <tt>Put(key, value)</tt>,
<tt>Append(key, arg)</tt>, and <tt>Get(key)</tt>. It maintains
a simple database of key-value pairs. <tt>Put()</tt> replaces
the value for a particular key in the database, <tt>Append(key,
arg)</tt> appends arg to key's value, and <tt>Get()</tt>
fetches the current value for a key. An <tt>Append</tt> to
a non-existant key should act like <tt>Put</tt>.
</p>
<p>
You will implement the service as a replicated state machine
consisting of several kvservers. Your kvraft client code
(<tt>Clerk</tt> in <tt>src/kvraft/client.go</tt>) should try
different kvservers it knows about until one responds
positively. As long as a client can contact a kvraft server
that is a Raft leader in a majority partition, its operations
should eventually succeed.
</p>
<p>
Your kvraft servers should not directly communicate; they
should only interact with each other through the Raft log.
</p>
<p>
Your first task is to implement a solution that works
when there are no dropped messages, and no failed
servers. Note that your service must provide
<em>sequential consistency</em> to applications that
use its client interface. That is, completed
application calls to the <tt>Clerk.Get()</tt>,
<tt>Clerk.Put()</tt>, and <tt>Clerk.Append()</tt>
methods in <tt>kvraft/client.go</tt> must appear to
have affected all kvservers in the same order, and have
at-most-once semantics. A <tt>Clerk.Get(key)</tt>
should see the value written by the most recent
<tt>Clerk.Put(key, …)</tt> or
<tt>Clerk.Append(key, …)</tt> (in the total
order).
</p>
<p>
A reasonable plan of attack may be to first fill in the
<tt>Op</tt> struct in <tt>server.go</tt> with the
"value" information that kvraft will use Raft to agree
on (remember that <tt>Op</tt>
field names must start with capital letters, since they
will be sent through RPC), and then implement the
<tt>PutAppend()</tt> and <tt>Get()</tt> handlers in
<tt>server.go</tt>. The handlers should enter an
<tt>Op</tt> in the Raft log using <tt>Start()</tt>, and
should reply to the client when that log entry is committed.
Note that you <strong>cannot</strong> execute
an operation until the point at which it is committed in
the log (i.e., when it arrives on the Raft
<tt>applyCh</tt>).
</p>
<p>
After calling <tt>Start()</tt>, your kvraft
servers will need to wait for Raft to complete
agreement. Commands that have been agreed upon arrive
on the <tt>applyCh</tt>. You should think carefully
about how to arrange your code so that your code will
keep reading <tt>applyCh</tt>, while
<tt>PutAppend()</tt> and <tt>Get()</tt> handlers submit
commands to the Raft log using <tt>Start()</tt>. It is
easy to achieve deadlock between the kvserver and its
Raft library.
</p>
<p>
Your solution needs to handle the case in
which a leader has called Start() for a client
RPC, but loses its leadership before the
request is committed to the log. In this case
you should arrange for the client to re-send
the request to other servers until it finds
the new leader. One way to do this is for the
server to detect that it has lost leadership,
by noticing that a different request has
appeared at the index returned by Start(), or
that the term reported by Raft.GetState() has
changed.
If the ex-leader is partitioned by
itself, it won't know about new leaders; but
any client in the same partition won't be able
to talk to a new leader either, so it's OK in
this case for the server and client to wait
indefinitely until the partition heals. More
generally, a kvraft server should not complete
a <tt>Get()</tt> RPC if it is not part of a majority.
</p>
<p>
You have completed Part I when you
<strong>reliably</strong> pass the first test in the
test suite: "One client". You may also find that you
can pass the "concurrent clients" test, depending on
how sophisticated your implementation is.
From the <tt>src/kvraft</tt> directory:
<pre>
$ go test -v -run Basic
=== RUN TestBasic
Test: One client ...
... Passed
--- PASS: TestBasic (15.18s)
PASS
ok kvraft 15.190s</pre>
</p>
<h2>Part II</h2>
<p>
In the face of unreliable connections and node failures, your
clients may send RPCs multiple times until it finds a kvraft
server that replies positively. One consequence of this is that
you must ensure that each application call to
<tt>Clerk.Put()</tt> or <tt>Clerk.Append()</tt> must appear in
that order just once (i.e., write the key-value database just
once).
</p>
<p>
Thus, your task in Part II is to cope with duplicate client requests, including
situations where the client sends a request to a kvraft leader
in one term, times out waiting for a reply, and re-sends the
request to a new leader in another term. The client request
should always execute just once.
</p>
<p>
You will need to uniquely identify client operations to
ensure that they execute just once. You can assume that
each clerk has only one outstanding <tt>Put</tt>,
<tt>Get</tt>, or <tt>Append</tt>.
</p>
<p>
For stability, you must make sure that your scheme for duplicate
detection frees server memory quickly, for example by
having the client tell the servers which RPCs it has
heard a reply for. It's OK to piggyback this
information on the next client request.
</p>
<p>
You have completed Part II when you
<strong>reliably</strong> pass all tests through
<tt>TestPersistPartitionUnreliable()</tt>.
<pre>
$ go test -v
=== RUN TestBasic
Test: One client ...
... Passed
--- PASS: TestBasic (15.22s)
=== RUN TestConcurrent
Test: concurrent clients ...
... Passed
--- PASS: TestConcurrent (15.83s)
=== RUN TestUnreliable
Test: unreliable ...
... Passed
--- PASS: TestUnreliable (16.68s)
=== RUN TestUnreliableOneKey
Test: Concurrent Append to same key, unreliable ...
... Passed
--- PASS: TestUnreliableOneKey (1.40s)
=== RUN TestOnePartition
Test: Progress in majority ...
... Passed
Test: No progress in minority ...
... Passed
Test: Completion after heal ...
... Passed
--- PASS: TestOnePartition (2.54s)
=== RUN TestManyPartitionsOneClient
Test: many partitions ...
... Passed
--- PASS: TestManyPartitionsOneClient (24.08s)
=== RUN TestManyPartitionsManyClients
Test: many partitions, many clients ...
... Passed
--- PASS: TestManyPartitionsManyClients (26.12s)
=== RUN TestPersistOneClient
Test: persistence with one client ...
... Passed
--- PASS: TestPersistOneClient (18.68s)
=== RUN TestPersistConcurrent
Test: persistence with concurrent clients ...
... Passed
--- PASS: TestPersistConcurrent (19.34s)
=== RUN TestPersistConcurrentUnreliable
Test: persistence with concurrent clients, unreliable ...
... Passed
--- PASS: TestPersistConcurrentUnreliable (20.37s)
=== RUN TestPersistPartition
Test: persistence with concurrent clients and repartitioning servers...
... Passed
--- PASS: TestPersistPartition (26.91s)
=== RUN TestPersistPartitionUnreliable
Test: persistence with concurrent clients and repartitioning servers, unreliable...
... Passed
--- PASS: TestPersistPartitionUnreliable (26.89s)
PASS
ok kvraft 214.069s</pre>
</p>
<h2>Resources and Advice</h2>
<p>
This assignment doesn't require you to write much code, but
you will most likely spend a substantial amount of time
thinking and staring at debugging logs to figure out
why your implementation doesn't work. Debugging will be
more challenging than in the Raft project because there are
more components that work asynchronously of each other.
Start early!
</p>
<p>
You should implement the service without worrying about the
Raft log's growing without bound. You do not need to implement
snapshots (from Section 7 in the paper) to allow garbage collection
of old log entries.
</p>
<p>
As noted, a kvraft server should not complete a <tt>Get()</tt>
RPC if it is not part of a majority (so that it does
not serve stale data). A simple solution is to enter
every <tt>Get()</tt> (as well as each <tt>Put()</tt>
and <tt>Append()</tt>) in the Raft log. You don't have
to implement the optimization for read-only operations
that is described in Section 8.
</p>
<p>
In Part I, you should probably modify your client
Clerk to remember which server turned out to
be the leader for the last RPC, and send the
next RPC to that server first. This will avoid
wasting time searching for the leader on every
RPC.
</p>
<h2>Submission</h2>
<p>
Submit your code via blackboard <a href="https://blackboard.kaust.edu.sa">CS240 Assignment 4</a>. We expect your submission to only contain these files: <tt>kvraft/client.go</tt>, <tt>kvraft/server.go</tt>, and <tt>kvraft/common.go</tt>.
You may submit multiple times, only the one in blackboard at the time of grading will be recorded.
</p>
<p>
Before submitting, please run the full tests given above for each part one final time. <b>You</b> are responsible for making sure your code works.
</p>
<p>
You will receive full credit for Part I if your software passes the tests mentioned for that section on our servers.
You will receive full credit for Part II if your software passes the tests mentioned for that section on our servers.
</p>
<p>
The final portion of your credit is determined by code quality tests, using the standard tools <tt>gofmt</tt> and <tt>go vet</tt>.
You will receive full credit for this portion if all files submitted conform to the style standards set by <tt>gofmt</tt> and the report from <tt>go vet</tt> is clean for your mapreduce package (that is, produces no errors).
If your code does not pass the <tt>gofmt</tt> test, you should reformat your code using the tool. You can also use the <a href="https://github.com/qiniu/checkstyle">Go Checkstyle</a> tool for advice to improve your code's style, if applicable. Additionally, though not part of the graded cheks, it would also be advisable to produce code that complies with <a href="https://github.com/golang/lint">Golint</a> where possible.
</p>
<h2>Acknowledgements</h2>
<p>This assignment is adapted from Princeton's COS-418, and previously from MIT's 6.824 course. Thanks to Mike Freedman, Frans Kaashoek, Kyle Jamieson, Robert Morris, and Nickolai Zeldovich for their support.</p>
<hr />
<p class="lastupdated">Last updated: 2017-11-08 16:51:55</p>
</body>
</html>