Jason Gullickson edited this page Jun 22, 2014 · 4 revisions

Not a lot here yet, mostly a scratchpad for future features, etc.

Thoughts on federation

A distributed configuration of jsfs would consist of a local jsfs daemon running on the system that will be accessing the filesystem and listening on local ports. This is akin to a "network client" or "redirector", where the applications using the store access the distributed file store via calls through this local instance.

The local instance of jsfs maintains a list of other nodes sharing a filesystem. The jsfs metadata is shared among these hosts and replicated automatically. When the local node requests a file from the distributed filesystem, the local jsfs instance receives the request and looks up the hashblock(s) for the requested file. The local block store is then interrogated for the block files. If the blocks do not exist in the local block store, a request is made to each node in the list of nodes sharing the filesystem. These requests target a new (currently not-implemented) endpoint which accepts requests only for specific hashblocks and returns only the contents of those blocks. Note that requests are sent to all nodes asynchronously, as quickly as possible, and as soon as any node returns a "found" response for the requested block, all other outstanding requests are cancelled. This ensures that the most responsive node handles the request without any negotiation overhead.
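The fan-out-and-cancel lookup above could be sketched roughly like this. Note that `fetchFromNode` and the per-node hashblock endpoint are hypothetical stand-ins — jsfs does not implement this federation endpoint yet:

```javascript
// Sketch: ask every peer node for a hashblock in parallel; the first
// "found" response wins and all other outstanding requests are cancelled.
// fetchFromNode(node, blockHash, controller) is a hypothetical function
// that resolves with the block contents or rejects if the node lacks it;
// it is expected to honor controller.cancelled and abort its request.
function requestBlock(nodes, blockHash, fetchFromNode) {
  return new Promise(function (resolve, reject) {
    var settled = false;
    var failures = 0;
    var controllers = nodes.map(function () { return { cancelled: false }; });

    nodes.forEach(function (node, i) {
      fetchFromNode(node, blockHash, controllers[i])
        .then(function (blockData) {
          if (settled) return;          // a faster node already answered
          settled = true;
          // cancel every other in-flight request
          controllers.forEach(function (c, j) {
            if (j !== i) { c.cancelled = true; }
          });
          resolve({ node: node, data: blockData });
        })
        .catch(function () {
          failures += 1;
          if (!settled && failures === nodes.length) {
            reject(new Error('block ' + blockHash + ' not found on any node'));
          }
        });
    });
  });
}
```

Because every request is issued before any response is handled, there is no negotiation round: whichever node answers first simply wins.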

When a node with the requested hashblock(s) is found, the blocks are transferred to the local host and stored. This serves both as a cache for future requests from the local host and as a source of blocks for other nodes using the distributed filesystem. An expiration system for these blocks is not described here, since it's really part of a more global storage maintenance subsystem that has yet to be designed.

Thoughts on redundancy

Redundancy is necessary to protect the filesystem in the event of hardware failures, etc. Redundancy can be delegated to the underlying operating system, but there's a place for redundancy in the jsfs layer as well, allowing jsfs objects to be stored redundantly without relying on external mechanisms.

A mirroring option, when engaged, could ensure that any hashblocks stored to disk are written to more than one place. This could be done in a couple of different ways.

One option is to leverage the federation mechanism above, requiring at least two nodes participating in a distributed filesystem. Then, if the "mirror" option is engaged, hashblocks written to disk are first written locally, then immediately written to a second node in the distributed filesystem.
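A write under this option might look like the sketch below. `storeLocally` and `sendToNode` are hypothetical stand-ins for the local block store and the (not yet implemented) peer hashblock endpoint:

```javascript
// Federation-based mirror: write the block locally first, then
// immediately replicate it to a second node in the distributed
// filesystem. Requires at least one other participating node.
function storeMirrored(blockHash, blockData, peers, storeLocally, sendToNode) {
  if (peers.length < 1) {
    throw new Error('mirror mode requires at least one other node');
  }
  storeLocally(blockHash, blockData);                 // local write first
  return sendToNode(peers[0], blockHash, blockData);  // then the mirror copy
}
```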

A second option could be contained entirely within a single jsfs instance. In this arrangement, an array of storage locations could be configured, and each hashblock would be written to each storage location in turn. As long as these locations point to separate storage hardware (disks, NAS, etc.), redundancy is achieved. The read side of jsfs would also need to consult this storage-location list during read operations.
