Server-local decisions based on cache knowledge

With remote processing, the client lacks knowledge of the system state on the server side. Here we investigate the benefits of delegating some data access decisions to the storage layer when it is aware of its recently accessed data. In this experiment, we measure the execution time of a range query with 10% selectivity on a 1-billion-row TPC-H Lineitem table, comparing a server that blindly processes client requests against one that processes requests based on local knowledge. This setup assumes data requests can be reordered, which holds in our case since we read all of the data partitions. The query is defined as:

SELECT * FROM lineitem1B WHERE l_extendedprice > 71000;

The experimental setup consists of one client machine and one server machine, and processing happens on either the client or the server side. With client-side processing, the server sends the entire dataset over the network to the client, which performs the scan and filtering. With server-side processing, Ceph performs the scan and filter on each object locally, then sends only the rows that pass the filter back to the client. Both modes are sensitive to the degree of parallelism, which is set to 12 threads on the client (dispatching requests and processing data) and 12 threads per server machine/Ceph OSD (processing data).
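The contrast between the two modes can be sketched as follows. This is a minimal illustration, not the actual implementation: `fetch_object` and `exec_on_osd` are hypothetical stand-ins for the real per-object Ceph requests, and the in-memory store replaces the 10,000 on-disk objects.

```python
PRICE_THRESHOLD = 71000

# Toy stand-in for the object store: object ID -> list of rows.
STORE = {i: [{"l_orderkey": i * 10 + j, "l_extendedprice": 1000.0 * (i + j)}
             for j in range(10)]
         for i in range(100)}

def fetch_object(obj_id):
    """Client-side mode: the whole object travels over the network."""
    return STORE[obj_id]

def exec_on_osd(obj_id, predicate):
    """Server-side mode: the OSD applies the filter locally and
    returns only the matching rows."""
    return [r for r in STORE[obj_id] if predicate(r)]

def client_side_query(objects):
    result = []
    for oid in objects:
        rows = fetch_object(oid)  # entire object shipped to the client
        result += [r for r in rows if r["l_extendedprice"] > PRICE_THRESHOLD]
    return result

def server_side_query(objects):
    result = []
    for oid in objects:
        # only rows passing the pushed-down filter cross the network
        result += exec_on_osd(oid, lambda r: r["l_extendedprice"] > PRICE_THRESHOLD)
    return result
```

Both functions return the same rows; they differ only in where the filter runs and hence in how many bytes cross the network.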

The client and server machines have the same hardware, with 128GB of RAM, but our dataset is about 140GB and hence does not fit completely within RAM. The dataset is partitioned into 10,000 objects of several megabytes each, and all data is stored on the server machine's SSD. The network is 10GbE.

In the figure below, "Cold" is the baseline execution time of the query for both client- and server-side processing, with a cold cache on the server. "Hot" refers to a hot cache, which is first warmed by scanning and filtering all of the objects in forward (fwd), random (rnd), or backward (bwd) order, where each object can be requested by its object ID. We then issue a subsequent query in the indicated direction and report its execution time. For example, server-hot-fwd-bwd indicates server-side processing with a hot cache, a forward warmup sequence, and then a backward read sequence. Each experiment is run three times and error bars represent one standard deviation.
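The three request orderings can be generated from the object IDs as follows (a sketch; the function name and fixed seed are illustrative choices, not part of the system):

```python
import random

def request_order(object_ids, direction, seed=0):
    """Return object IDs in forward (fwd), random (rnd), or
    backward (bwd) request order."""
    ids = list(object_ids)
    if direction == "fwd":
        return ids
    if direction == "bwd":
        return ids[::-1]
    if direction == "rnd":
        rng = random.Random(seed)  # fixed seed so runs are repeatable
        rng.shuffle(ids)
        return ids
    raise ValueError(f"unknown direction: {direction}")

# e.g., the server-hot-fwd-bwd experiment warms the cache with
# request_order(ids, "fwd") and then reads with request_order(ids, "bwd").
```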

Locality still matters

The results show that the client-side (blue) and server-side (red) baselines are similar, with server-side about 12% faster. They also show higher variance for client-side processing in general. The three server-side hot-cache results perform dramatically better, but since the dataset is larger than the cache, some cache misses remain. The right-most result performs best because it incurs the fewest cache misses under an LRU policy: a forward warmup leaves the last objects in the sequence resident in cache, and a backward read visits exactly those objects first.
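The LRU effect can be checked with a small simulation. This is a sketch under assumed numbers: 10,000 objects as in our setup, and a hypothetical cache capacity of 9,000 objects (roughly the 128GB/140GB ratio). A forward read after a forward warmup thrashes the cache (every access evicts the object needed 9,000 accesses later), while a backward read starts with the most recently cached objects.

```python
from collections import OrderedDict

def simulate(warmup, read, capacity):
    """Replay two access sequences through an LRU cache and count
    hits during the second (read) pass."""
    cache = OrderedDict()
    def access(oid):
        hit = oid in cache
        if hit:
            cache.move_to_end(oid)       # mark as most recently used
        else:
            cache[oid] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
        return hit
    for oid in warmup:
        access(oid)
    return sum(access(oid) for oid in read)

objs = list(range(10_000))   # 10,000 object IDs, as in the experiment
cap = 9_000                  # assumed: cache holds ~90% of the objects

print(simulate(objs, objs, cap))                 # fwd warmup, fwd read: 0 hits
print(simulate(objs, list(reversed(objs)), cap)) # fwd warmup, bwd read: 9000 hits
```

In the forward/forward case every read misses, because the forward warmup evicted the head of the sequence and each new read keeps evicting the next object just before it is needed; reversing the read direction turns all resident objects into hits.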

This example shows that when processing remote data, even though the database client application cannot optimize its requests, enabling the server to perform some optimizations based on local state information can be extremely beneficial. Even with remote processing in the cloud, locality still matters. We have not yet added this functionality, but we are exploring it now.