RPC-Based File System

Introduction

In distributed computing systems, it is often necessary to make use of resources located on remote machines. We can write our distributed applications to use the network to communicate between different components running on different machines.

In this project, I use remote procedure calls (RPCs) to provide access to remote services through an interface identical to that of local services. This is done by interposing RPC stubs in place of the C library functions that handle file operations.

The main purpose of this project is to design an RPC framework for a file system. Clients can send requests to perform operations on the server, such as open, close, read, write, lseek, stat, unlink, getdirentries, getdirtree, and freedirtree. The framework also supports multiple clients sending requests to the server concurrently. The server uses fork to meet this requirement.

Serialization Protocol

To keep things simple, two structs are created for each operation. Let's call them object_a_t and object_b_t. From the client's point of view, object_a_t is used to send data to the server and object_b_t to receive data from it. From the server's point of view, it receives an object_a_t and unmarshalls it, then marshalls an object_b_t and sends the serialized data back to the client.

Besides basic data types, structs very commonly contain pointers. A pointer stores the memory address of some data, and that address is only valid inside the process that created it. If the client simply sends a pointer to the server, or the server sends one to the client, the value is meaningless on the other side, and dereferencing it can even crash the receiver.

To make sure each operation is delivered correctly, every pointer is replaced by the struct or data it points to.

An example of one pair of structs
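As an illustration, a pair for the open operation might look like the sketch below. The struct names, field names, and the MAX_PATH_LEN capacity are assumptions made for this example and are not taken from the project's repository. The point is that the path is embedded in the request by value rather than sent as a char pointer.

```c
#include <stdint.h>
#include <string.h>

#define MAX_PATH_LEN 256   /* assumed fixed capacity for path strings */

/* Request struct (the object_a_t role): client -> server.
 * The path is stored inline, so no pointer crosses the network. */
typedef struct {
    int32_t  flags;
    uint32_t mode;
    char     path[MAX_PATH_LEN];
} open_request_t;

/* Reply struct (the object_b_t role): server -> client. */
typedef struct {
    int32_t fd;    /* result of open on the server, or -1 */
    int32_t err;   /* errno value on the server if fd is -1 */
} open_reply_t;

/* Copy the caller's arguments into a request, replacing the path
 * pointer with the bytes it points to.
 * Returns 0 on success, -1 if the path does not fit. */
int pack_open_request(open_request_t *req, const char *path,
                      int flags, unsigned mode) {
    size_t len = strlen(path);
    if (len >= MAX_PATH_LEN) return -1;
    memset(req, 0, sizeof *req);         /* no stray bytes on the wire */
    req->flags = flags;
    req->mode  = mode;
    memcpy(req->path, path, len + 1);    /* include the terminating NUL */
    return 0;
}
```

Sending such a struct as raw bytes assumes that client and server share the same struct layout and byte order. A portable protocol would serialize each field explicitly.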

Clients Bear More Responsibilities

The scheme described above misses one special case: variable-length arrays. They do come up in practice. Take the write operation: each call may send a different number of bytes from the client to the server.

Arrays are therefore assigned a fixed length beforehand. This makes life much easier, because the server and the clients both know exactly how many bytes to expect from the other side.

To send all the data of an array, the client may need several requests. Only after all of the data has arrived does the server execute the desired operation.
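The client side of this idea can be sketched as follows. This is a minimal illustration under assumptions: CHUNK_SIZE, write_chunk_t, the remaining field, and the emit callback are all hypothetical names, and count_bytes only stands in for code that would actually send each request over the socket.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_SIZE 1024   /* assumed fixed array length per request */

/* One fixed-size request carrying a piece of the caller's buffer. */
typedef struct {
    int32_t  fd;
    uint32_t len;         /* valid bytes in data */
    uint32_t remaining;   /* bytes still to come after this chunk */
    uint8_t  data[CHUNK_SIZE];
} write_chunk_t;

/* Hypothetical transport hook: returns 0 on success. */
typedef int (*emit_chunk_fn)(const write_chunk_t *chunk, void *ctx);

/* Stand-in transport: just adds up the bytes it was given. */
int count_bytes(const write_chunk_t *chunk, void *ctx) {
    *(size_t *)ctx += chunk->len;
    return 0;
}

/* Split buf into fixed-size requests and emit them in order.
 * Returns the number of requests sent, or -1 on a transport error. */
int send_in_chunks(int fd, const void *buf, size_t count,
                   emit_chunk_fn emit, void *ctx) {
    const uint8_t *src = buf;
    int requests = 0;
    do {
        write_chunk_t chunk;
        size_t n = count < CHUNK_SIZE ? count : CHUNK_SIZE;
        memset(&chunk, 0, sizeof chunk);
        chunk.fd = fd;
        chunk.len = (uint32_t)n;
        chunk.remaining = (uint32_t)(count - n);
        if (n > 0) memcpy(chunk.data, src, n);
        if (emit(&chunk, ctx) != 0) return -1;
        src += n;
        count -= n;
        requests++;
    } while (count > 0);
    return requests;
}
```

With these assumed sizes, a 2500-byte write becomes three requests of 1024, 1024, and 452 bytes. The server can buffer the pieces and perform the write once remaining reaches zero.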

Handling Conflicting File Descriptor Sets

File descriptors can conflict. The client has its own set of file descriptors, and so does the server, so the same number may refer to a local file on the client and to a remote file on the server. To keep them apart, a mapping mechanism is adopted: a fixed offset is added to each server-side file descriptor. The client can then tell which file descriptors belong to itself and which belong to the server.
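The mapping can be sketched in a few lines. The FD_OFFSET value and the helper names below are assumptions chosen for illustration. Any offset larger than the highest descriptor the client process will ever hold locally would work.

```c
#define FD_OFFSET 10000   /* hypothetical offset, above any local fd */

/* Translate a descriptor returned by the server into the value handed
 * to the client application. Error values (negative) pass through. */
int to_client_fd(int server_fd) {
    return server_fd < 0 ? server_fd : server_fd + FD_OFFSET;
}

/* True if the descriptor was issued by the server, so the RPC stub
 * should forward the call instead of using the local C library. */
int is_remote_fd(int fd) {
    return fd >= FD_OFFSET;
}

/* Recover the server-side descriptor before sending a request. */
int to_server_fd(int client_fd) {
    return client_fd - FD_OFFSET;
}
```

A stub for read or close would first call is_remote_fd. It forwards the request with to_server_fd(fd) when the result is true and falls back to the local library function otherwise.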

GitHub Repository for This Project: RPC-Based File System
