Application into Parallel Search Structures

Parallel programming approach to data structure:

Process independently access data structures via concurrency control (locks)
- Non-blocking binary search tree - cost depends on number of concurrent processes and not necessarily balanced
- CBTree - concurrent splay tree, performs very well empirically but does not provide same guarantees of normal splay tree
Processes use data structures that are designed to run operations in parallel
- Parallel 2-3 tree: can run p operations on p processors in O( logn + logp ) time

BUT:

Difficult to use - programmer must know about the structure very well in order to coordinate access (or use concurrency control)

Easier to let the programmer write programs normally and use a parallel black box data structure

Implicit Batching:

Programmer writes a parallel program with calls to a parallel data structure
Runtime environment understands and connects the calls to the structure and the structure itself
Scheduler batches operations implicitly, and schedules accordingly

Kunal Agrawal, Seth Gilbert, Wei Quan Lim in 2018 designed such a black box parallel structure supporting this:

Based on Iacono's Working Set Structure
First parallel structure that provides worst-case bounds that adapt to the access sequence
Takes work nearly proportional to the working set bound
Uses a program DAG model
- vertices are calls to the data structure
- edges between nodes when it requires a previous call
BUT requires a weak priority or greedy scheduler, which are often not used in practice

p is the number of processes
eL is the number of small-ops, defined as operations in L that are performed on the map when its size is less than n
- can usually ignore eL*logp term
d is the number of calls to the data structure
T1 is total number of calls (nodes in DAG), T_infinity is longest path in DAG

Parallelism is within O [ (logp)^2 +logn ] of the optimal

Uses batched search structure instead of a tree
If element is in last tree and many threads access at once, will take b*logn time when it should take logn+b
- Combine duplicate accesses in batching

Next: About, References, Extensions

Google Sites

Report abuse