-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
lucarin91
committed
Apr 15, 2016
1 parent
e0780d5
commit efcb725
Showing
8 changed files
with
76 additions
and
75 deletions.
There are no files selected for viewing
Binary file not shown.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,11 +1,11 @@ | ||
# Introduction | ||
The aim of the project is to create a weakly consist distributed file-system by the use of gossiping, consistent hashing and vector clocks. | ||
The aim of the project is to create a weakly consistent distributed file system by using gossiping, consistent hashing and vector clocks. | ||
|
||
The communication between node exploit the Java socket mechanism so it can be execute in different ways: on a single machine with threads, on a cluster of servers or in virtual containers using *docker*. | ||
The communication between nodes exploits the Java socket mechanism, thus it can be executed in different ways: on a single machine with threads, on a cluster of servers or in virtual containers using *Docker*. | ||
|
||
The file-system is implemented as a map with a string key and a number or string value with the following operations: | ||
The file system is implemented as a key value map of type $\langle string, string \cup number \rangle$ with the following operations: | ||
|
||
- **add**(key, value), add only if the key is not present. | ||
- **get**(key), get the value of the key if present. | ||
- **update**(key, value), update the key with the new value only if the key already exists. | ||
- **remove**(key), remove the key if present. | ||
- **add**(key, value) that adds the pair only if the key is not present; | ||
- **get**(key) returning the value of the key if present; | ||
- **update**(key, value) that updates the key with the new value only if the key already exists; | ||
- **remove**(key) that removes the key if present; |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,38 +1,38 @@ | ||
# Logical Structure | ||
|
||
![Project logical structure](./img/pad-logic.png) | ||
|
||
The file-system is composed by two fundamental parts: | ||
The file system is composed by two fundamental parts: | ||
|
||
- the front-end, that provides the external access to the file system through a Restful JSON API; | ||
- the storage system itself where the data are stored and managed. | ||
|
||
- the front-end, that provides the external access to the file-system through a Restful json API, | ||
- the storage system its self where the data are stored and manage. | ||
The former component does not keep any track of the data stored in the system. It only knows the servers, thanks to the gossiping protocol. | ||
|
||
The first part doesn't have any information of the data store in the system. It only knows the servers, thanks to the gossiping protocol. | ||
|
||
## Communication system | ||
All the internal communication are done with the UDP transport protocol, to avoid the overhead of the TCP and assuming a reliable network between the servers. | ||
All the internal communication relies upon the UDP transport protocol to avoid the overhead of the TCP and assuming a reliable network between the servers exists. | ||
|
||
All the nodes, either the front-end and the storage one, use the gossip protocol to update the list of the servers involved in the file-system. | ||
So that each node has two services running on different ports, one for the gossip protocol and one for receiving messages from the other nodes. | ||
All the nodes, either the front-end and the storage one, exploit gossiping to update the list of the servers involved in the file system. | ||
According to that, each node runs two services on different ports: one in charge of the gossiping protocol, the other for receiving messages from the other nodes. | ||
|
||
When a new request arrives to a front node it is sent to a random storage node from its list and than it wait 5 second for an acknowledgement that the request is correctly served or an error, otherwise it assume that something goes wrong. | ||
When a new request arrives to a front-end node it is sent to a random storage node from its list and then it waits 5 seconds for an acknowledgement that the request has been correctly served or not. If no message is received, it is assumed that something went wrong. | ||
|
||
## Storage protocol | ||
All the storage nodes use consistent hashing to assign a key value to a given server with the following strategy: | ||
All the storage nodes use consistent hashing to assign a key-value pair to a given server with the following strategy: | ||
|
||
- a server is master for all the keys with lower or equal hash value. | ||
- each key is replicated to a fixed number of next server in the consistent hash. | ||
- a server is master for all the keys with lower or equal hash value; | ||
- each key is replicated to a fixed number of subsequence servers in the consistent hash ring; | ||
|
||
The system use a single master storage protocol without consensus, so the value is written or read without waiting for an acknowledgement from the backup's servers. | ||
The system use a single master storage protocol without consensus, so that the value is written or read without waiting for an acknowledgement from the backup servers. | ||
|
||
Each time a new server turn on it immediately became master for the keys with a lower hash and a backup server for the keys owned by the previous servers. So after their neighbors discovered it, they either send the keys that it has to manage or the keys that it has to keep for backups. | ||
Each time a new server turns on it immediately become master for the keys with a lower hash and the backup server for the keys owned by the previous server. So after its neighbors discovered it, they either send the keys that it has to manage or the keys that it has to keep for backup. | ||
|
||
Within the data it is also added a vector clock to keep trace of with server update the value. The vector clock is implemented using a map where the key is the server id and for the value a counter, in this way all the serves that don't have a key are considered zero. | ||
Within the data it is also added a vector clock to keep track of which server updated the value. The vector clock is implemented using a map where the key is the server id and the value is a counter; the servers that are not present in the map are considered with a 0 counter. | ||
|
||
So each time a server update a value as a master it increment the counter with its id inside the object, and foreword this new vector with the key value to its backups server. | ||
So each time a server updates a value as a master it increments the counter with its id inside the object, and forwards this new vector with the key-value to its backup servers. | ||
|
||
This vector clock is used every time two version of a value are founded, after some key management, to decide with is the newer. If two unconfrontable version of the value are founded the node server create value with the two different version and put the `conflict` flag to true. | ||
This vector clock is used every time two version of a value must be compared to decide which is the most recent. If two uncomparable versions of a value are found, the server node creates a value with the two different versions and sets the `conflict` flag. | ||
|
||
At this point where a user try to get that key, it receive all the conflict version and it can decide with one it consider the correct newer version by done an update operation. | ||
At this point when a user attempts to get a key, he/she receives all the conflicting versions and can decide which one is the correct version by performing an update operation. | ||
|
||
After the update the server resolve the conflict and merge all the vector clock for of all the value, in this way if at same time one of this old values are founded it will be discarded. | ||
After the update the server resolves the conflict and merges all the vector clocks together. For so, subsequent incoming old values for a key will be discarded. |
Oops, something went wrong.