Caching is a powerful technique for improving the performance of a distributed system: it reduces data transfers and lowers the latency of operations. This project therefore goes beyond plain remote file operations and adds caching. It takes an existing binary tool and interposes on the corresponding C library calls. Instead of connecting directly to a server, however, the interposition library connects to a caching proxy, and the proxy, in turn, connects to the server.
The proxy handles the RPC file operations issued by the client interposition library. It fetches whole files from the server and caches them locally. Between the proxy and the server, the system uses a protocol that offers open-close session semantics on whole files.
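To make the proxy-to-server protocol concrete, here is a minimal sketch of what the remote interface could look like over Java RMI, shipping whole files together with a version number. The type and method names (FileServer, FileContents, fetch, upload) are assumptions for illustration, not the project's actual API.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;

// Hypothetical RMI contract between proxy and server.
// Whole files travel as byte arrays together with a version number,
// which is what check-on-use and open-close session semantics rely on.
interface FileServer extends Remote {
    // Returns the current version of a path, or -1 if it does not exist.
    long getVersion(String path) throws RemoteException;

    // Downloads the whole file plus its version in one call.
    FileContents fetch(String path) throws RemoteException;

    // Uploads the whole file on close; the server bumps and returns the new version.
    long upload(String path, byte[] data) throws RemoteException;

    // Removes the file on the server (backs the client's unlink).
    void remove(String path) throws RemoteException;
}

// Simple value object carrying the bytes and the version they correspond to.
class FileContents implements java.io.Serializable {
    final long version;
    final byte[] data;
    FileContents(long version, byte[] data) { this.version = version; this.data = data; }
}
```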
The Architecture
Caching Protocol
Implement a check-on-use cache coherence protocol.
Robustly handle multiple concurrent clients.
Ensure open-close session semantics on concurrent file access.
Use LRU to manage a fixed-size cache containing variable-sized files (see the sketch after these lists).
Distributed System
Use whole file caching.
Use Java RMI, Java threading, and concurrency management techniques.
Emulate C file operations on locally-cached files.
System Calls
open, close, read, write, lseek, and unlink.
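For the LRU requirement above, one minimal approach is an access-ordered map that tracks the size of every cached file and evicts least-recently-used entries until the total fits a byte budget. The class and method names below are made up for illustration.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch of an LRU cache over variable-sized files: track the size of
// every cached file and evict least-recently-used entries until the total
// fits the budget. Names here are illustrative, not the project's.
class LruFileCache {
    private final long capacityBytes;
    private long usedBytes = 0;
    // access-order LinkedHashMap: get/put move an entry to the most-recent end
    private final LinkedHashMap<String, Long> sizes = new LinkedHashMap<>(16, 0.75f, true);

    LruFileCache(long capacityBytes) { this.capacityBytes = capacityBytes; }

    // Record (or touch) a cached file, then evict LRU entries while over budget.
    synchronized void put(String path, long sizeBytes) {
        Long old = sizes.put(path, sizeBytes);
        usedBytes += sizeBytes - (old == null ? 0 : old);
        Iterator<Map.Entry<String, Long>> it = sizes.entrySet().iterator();
        while (usedBytes > capacityBytes && it.hasNext()) {
            Map.Entry<String, Long> lru = it.next();
            if (lru.getKey().equals(path)) continue;    // never evict the entry just inserted
            usedBytes -= lru.getValue();
            it.remove();                                // a real proxy would also delete the file on disk
        }
    }

    // Looking a file up also refreshes its recency.
    synchronized boolean contains(String path) { return sizes.get(path) != null; }
}
```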
After introducing the architecture, we should also see what the system looks like in a real-world scenario.
The Distributed System
Clients usually send their requests to a proxy that is geographically close to them, which reduces latency. It also relieves the server from handling a flood of requests at the same time. From the users' perspective, it looks as if they were sending requests directly to the server, because the whole system maintains one-copy semantics. The trickiest part is the mechanism for operating on these files and keeping track of their version numbers.
Each time a client asks to read or write a file, the proxy needs to check whether the cache holds that file and whether the cached copy is the newest version. If not, the proxy fetches the file from the server. After the client finishes its operations, the proxy pushes the updated file back to the server.
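A rough sketch of that check-on-use step on open is below, reusing the hypothetical FileServer and FileContents types from the earlier sketch; the cache layout and helper names are assumptions.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Check-on-use sketch: on every open, ask the server for the current version
// and re-download the whole file only when the cached copy is stale or missing.
class ProxyOpen {
    private final FileServer server;
    private final Map<String, Long> cachedVersions = new ConcurrentHashMap<>();
    private final Path cacheDir;

    ProxyOpen(FileServer server, Path cacheDir) {
        this.server = server;
        this.cacheDir = cacheDir;
    }

    Path open(String path) throws Exception {
        long serverVersion = server.getVersion(path);     // the "check" in check-on-use
        Long cached = cachedVersions.get(path);
        Path local = cacheDir.resolve(path.replace('/', '_'));
        if (cached == null || cached < serverVersion) {
            FileContents fresh = server.fetch(path);      // whole-file download
            Files.write(local, fresh.data);
            cachedVersions.put(path, fresh.version);
        }
        return local;                                     // the session now works on the local copy only
    }
}
```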
There are two types of concurrency to explore, on two levels. The two levels are clients working against the same proxy, and proxies working against the server. The two types are working on the same file versus working on different files.
Different Concurrency
Obviously, different clients or different proxies working on different files is perfectly fine: the files do not interfere with one another, so this case is safe.
However, what about working on the same file? This is where things get interesting. The usual approach is the reader-writer model: a file can be read by multiple clients concurrently, but written by only one client in the whole system at a time, and that is the maximum concurrency this model achieves.
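As a baseline, a per-file reader-writer lock gives exactly those semantics: many concurrent readers or a single writer. A minimal sketch in Java, with names invented for illustration:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Baseline reader-writer scheme: per file, many readers OR one writer at a time.
class PerFileLocks {
    private final Map<String, ReentrantReadWriteLock> locks = new ConcurrentHashMap<>();

    private ReentrantReadWriteLock lockFor(String path) {
        return locks.computeIfAbsent(path, p -> new ReentrantReadWriteLock());
    }

    void read(String path, Runnable body) {
        ReentrantReadWriteLock l = lockFor(path);
        l.readLock().lock();              // any number of readers may hold this at once
        try { body.run(); } finally { l.readLock().unlock(); }
    }

    void write(String path, Runnable body) {
        ReentrantReadWriteLock l = lockFor(path);
        l.writeLock().lock();             // excludes all readers and other writers
        try { body.run(); } finally { l.writeLock().unlock(); }
    }
}
```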
In most cases that level of concurrency is not acceptable in the real world. To improve it further, it is common to relax strict consistency. Once each proxy holds its own copy of the file fetched from the server, the same file can be modified on different proxies at the same time, which is a big leap!
Still, it is not enough. Here comes the magic: versioning. Before doing anything with a file, copy it first. This trick works both on the proxies and on the server. For example, if a client wants to operate on a file, it copies the file and performs the operations on the temporary copy; when it finishes, it overwrites the original. What if a new request arrives and the server already has a newer version? The older operations keep working on their older copy, while the new operations work on the newer copy.
The cost of copying is still there, though. To improve performance further, all reads share the same copy as long as there is no newer version, while each write gets its own copy.
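One way to realize that rule is to let all readers of a version share one on-disk copy, while every writer clones the file into a private copy and promotes it when its session closes. The sketch below assumes a simple naming convention for the copies; it is not the project's actual layout.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Copy-on-write sketch: readers share the cached copy of a version,
// each writer works on a private clone that is promoted back on close.
class SessionCopies {
    private final Path cacheDir;
    SessionCopies(Path cacheDir) { this.cacheDir = cacheDir; }

    // All readers of version v share this one file.
    Path readCopy(String name, long version) {
        return cacheDir.resolve(name + ".v" + version);
    }

    // Every writer gets its own clone, so concurrent sessions never see partial writes.
    Path writeCopy(String name, long version, long sessionId) throws Exception {
        Path shared = readCopy(name, version);
        Path priv = cacheDir.resolve(name + ".v" + version + ".w" + sessionId);
        Files.copy(shared, priv, StandardCopyOption.REPLACE_EXISTING);
        return priv;
    }

    // On close, the private copy becomes the shared copy of the next version.
    void promote(Path priv, String name, long newVersion) throws Exception {
        Files.move(priv, readCopy(name, newVersion), StandardCopyOption.REPLACE_EXISTING);
    }
}
```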
Clients talking to one proxy and proxies talking to the server can use the same scheme, so one solution handles both levels.
What about updating a file from the server? First download it as a temporary file, then overwrite the original file.
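A small sketch of that refresh step, assuming the fresh bytes were already fetched (for example via the hypothetical fetch above); the atomic rename keeps concurrent readers from ever seeing a partially written file.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Refreshing a cached file from the server without exposing a half-written copy:
// write the downloaded bytes into a temporary file first, then swap it in with
// a single rename. The helper name and layout are assumptions for illustration.
class CacheRefresh {
    static void refresh(Path cached, byte[] freshBytes) throws Exception {
        Path tmp = cached.resolveSibling(cached.getFileName() + ".tmp");
        Files.write(tmp, freshBytes);                     // the slow part happens off to the side
        Files.move(tmp, cached, StandardCopyOption.REPLACE_EXISTING,
                   StandardCopyOption.ATOMIC_MOVE);       // readers see the old file or the new one, never a mix
    }
}
```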
There are some rules we should follow.
Copy, unlink, and rename should be protected by a mutex, because they all operate on the original file.
If one thread is copying a file while another tries to remove it, bad things can happen, so always use a mutex in these situations.
Scenario 1
If there is a newer version, read the new version.
e.g., proxy 1 -> (A, version 3); proxy 2 -> (A, version 4), rename; proxy 3 -> (A, version 4).
After proxy 1 finishes its read, it finds that its copy is an old version; its reference count drops to 0, so the copy is removed. When proxy 3 starts to read, it should read the newer version (version 4).
Scenario 2
If a new read arrives for the same version of the file, increment its reference count.
If, after a write finishes, the written copy is already an old version, remove that copy.
However, if it is still the newest version after the write finishes, just keep it.
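Scenarios 1 and 2 can both be implemented with a per-version reference count on the proxy: readers increment it on open, decrement it on close, and a copy is deleted once it is both stale and unreferenced. The class below is a hedged sketch of that bookkeeping, with invented names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of reference counting for one cached version of a file.
// A copy is physically removed only when it is no longer the newest
// version AND no open session still points at it.
class CachedVersion {
    final long version;
    private final AtomicInteger refCount = new AtomicInteger(0);
    private volatile boolean stale = false;   // set when a newer version appears

    CachedVersion(long version) { this.version = version; }

    void openedForRead() { refCount.incrementAndGet(); }   // Scenario 2: new reader on the same version

    void markStale() { this.stale = true; }                // a newer version was fetched or written

    // Returns true when the on-disk copy may be deleted.
    boolean closed() {
        int remaining = refCount.decrementAndGet();
        return stale && remaining == 0;                     // Scenario 1: old version, ref count hits 0
    }
}
```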
Scenario 3
When a client operates on its own files, it should feel as if no one else were there; the same applies to the proxies. Therefore, the system keeps a local file descriptor set for each client, while the proxy maintains global file descriptors to tell them apart.
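Concretely, each client session can keep a small table mapping its own descriptors to proxy-wide identifiers, so two clients can both hold fd 3 without colliding. The sketch below uses invented names and is only one way to do it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the two-level descriptor scheme: every client gets its own
// small fd space, while the proxy hands out globally unique ids underneath.
class DescriptorTable {
    private static final AtomicLong NEXT_GLOBAL_ID = new AtomicLong(0);

    // local fd (per client) -> global id (proxy-wide)
    private final Map<Integer, Long> localToGlobal = new HashMap<>();
    private int nextLocalFd = 0;

    // Called on open(): allocate a local fd for this client, bound to a fresh global id.
    synchronized int open() {
        int fd = nextLocalFd++;
        localToGlobal.put(fd, NEXT_GLOBAL_ID.getAndIncrement());
        return fd;
    }

    // read/write/lseek translate the client's fd into the proxy's global id.
    synchronized Long resolve(int localFd) { return localToGlobal.get(localFd); }

    // Called on close(): drop the binding.
    synchronized void close(int localFd) { localToGlobal.remove(localFd); }
}
```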
GitHub Repository for This Project: Distributed File System