TDFS Publisher's description
from Ivan Voras
TDFS stands for "Trivially distributed file system", and is a proof-of-concept implementation of a distributed file system as a ("stacked") layer above normal file systems. It uses the FUSE libraries and subsystem to implement this operation in userland.
NOTE: This is currently a proof-of-concept implementation, not ready for production use! I would appreciate any feedback about this project - whether it works or it doesn't. Since I'm doing this in my spare time, it will take a long time for me to catch all the bugs alone; if you want to speed this project up, consider posting in the forums and/or submitting patches. You can contact me either personally or, better, via the SourceForge forums for the project.
The goal of TDFS is to solve single-writer-multiple-readers distribution of file system data (also called single-master-multiple-slave). In this scenario all writes happen (or originate) on one computer and are propagated to the others. Read requests can go to either the master or the slaves and are served locally; since reads never go over the network, this system doesn't offer strict synchronization. Some uses for this scenario are:
- Hot backups - all data is immediately propagated to a backup machine
- Archival - data is read-mostly, and it helps to make it available on a large number of machines
- Load balancing - one machine generates the data (possibly from a database) but is not a web server; other machines are web servers and serve their local copy of the data. This is also useful for separation of privileges (DMZ-style).
For example: in a scenario with one master and two slaves, the data is stored three times, once on each machine.
System Requirements:
- FUSE libraries and kernel module
The TDFS system consists of two daemons, tdfs and tdfs_slave. The tdfs daemon runs on the master server and provides a mount point whose operations are mirrored over the network to tdfs_slave daemons. Its most important command-line arguments are:
-m : Specify local directory that will be distributed. The local directory will be used for all read-only operations, and all write operations will be mirrored to slave daemons.
-c : Add a client (slave) host to the list of slaves. At least one slave must be specified, and slave daemons must be running before the master is started.
-z : Specify which compression option to use. 0 (the default) means no compression and 1 means liblzf is used. liblzf achieves between 50% and 100% compression with very little overhead, so in theory enabling it could make the difference between 100 Mbit/s and 200 Mbit/s operation (in practice, network latency will absolutely kill the throughput in either case).
-h : Show help message for additional options
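Putting the flags above together, a master invocation might look like the following sketch. The flag letters come from the list above, but the exact argument syntax, the directory path, and the hostname are assumptions for illustration only:

```shell
# Assumed example: start the master daemon, distributing /data/www.
#   -m /data/www          local directory to distribute (reads served locally)
#   -c slave1.example.com a slave host, whose tdfs_slave must already be running
#   -z 1                  enable liblzf compression for network traffic
tdfs -m /data/www -c slave1.example.com -z 1
```

Remember that the slave daemons must be started before the master, since the master connects to them at startup.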
Note: there is an error in the `README` supplied with the `tdfs-r1` release, which says that the `-n` switch enables `TCP_NODELAY`. This is the opposite of what `-n` does in that version: `TCP_NODELAY` is now enabled by default, and `-n` disables it.
Once the tdfs daemon is properly started, it provides a device entry /dev/fuseX, where X is a small integer incremented every time a FUSE daemon is (re)started. When started for the first time, the device entry will be /dev/fuse0, and this is the value used in the examples. Note that old, inactive entries are not removed and will remain even after the tdfs daemon exits (this is a peculiarity of FreeBSD and currently cannot be worked around). This device entry must be passed to the mount_fusefs utility to mount it on the desired directory.
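The mount step described above might look like this on FreeBSD (the device path matches the first-start case described in the text; the mount point is an example):

```shell
# Attach the FUSE device provided by the running tdfs daemon
# to the directory that clients should read from.
mount_fusefs /dev/fuse0 /mnt/tdfs

# When finished, detach it with the standard umount utility.
umount /mnt/tdfs
```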
The tdfs_slave daemon is simpler to start, and it accepts only a few really important arguments.
Program Release Status: Minor Update
Program Install Support: Install and Uninstall