To accompany the text “Introduction to Parallel Computing”,
Addison Wesley, 2003.
The first step in developing a parallel algorithm is to decompose the problem into tasks that can be executed concurrently.
A given problem may be decomposed into tasks in many different ways.
Tasks may be of the same, different, or even indeterminate sizes.
A decomposition can be illustrated in the form of a directed graph with nodes corresponding to tasks and edges indicating that the result of one task is required for processing the next. Such a graph is called a task dependency graph.
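As a small illustration (the expression and the task numbering below are chosen here and do not come from the text), consider evaluating (a + b) * (c + d): the two additions are independent tasks, while the multiplication needs both of their results. A sketch in C of the corresponding dependency graph, stored as an adjacency matrix:

    /* Task dependency graph for evaluating (a + b) * (c + d).
       Example chosen for illustration; not an example from the text. */
    #include <stdio.h>

    #define NTASKS 3

    int main(void) {
        /* deps[i][j] = 1 means task i needs the result of task j. */
        int deps[NTASKS][NTASKS] = {
            {0, 0, 0},   /* task 0: t0 = a + b   (no predecessors)     */
            {0, 0, 0},   /* task 1: t1 = c + d   (no predecessors)     */
            {1, 1, 0}    /* task 2: t0 * t1      (needs tasks 0 and 1) */
        };

        /* Tasks with no incoming edges have no dependencies and can run
           concurrently; here, tasks 0 and 1. */
        for (int i = 0; i < NTASKS; i++) {
            int ready = 1;
            for (int j = 0; j < NTASKS; j++)
                if (deps[i][j]) ready = 0;
            printf("task %d %s\n", i,
                   ready ? "can start immediately" : "must wait for predecessors");
        }
        return 0;
    }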
The computation of each element of the output vector y is independent of that of the other elements. Based on this, a dense matrix-vector product can be decomposed into n tasks. The figure highlights the portion of the matrix and of the vector accessed by Task 1.
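A minimal sketch of this decomposition, assuming a small n-by-n matrix A multiplying an input vector b (the names A, b, and y, the size N, and the use of OpenMP tasks are choices made here for illustration, not taken from the text): each of the n tasks computes one element of y, and no task waits on any other.

    /* Dense matrix-vector product y = A * b, one task per element of y.
       Compile with OpenMP support, e.g. cc -fopenmp matvec.c; without it,
       the pragmas are ignored and the code runs sequentially. */
    #include <stdio.h>

    #define N 4

    int main(void) {
        double A[N][N], b[N], y[N];

        /* Fill A and b with sample values. */
        for (int i = 0; i < N; i++) {
            b[i] = 1.0;
            for (int j = 0; j < N; j++)
                A[i][j] = i + j;
        }

        /* Task i computes y[i] as the inner product of row i of A with b.
           Every task reads the shared vector b, but no task waits for the
           (partial) completion of any other. */
        #pragma omp parallel
        #pragma omp single
        for (int i = 0; i < N; i++) {
            #pragma omp task firstprivate(i) shared(A, b, y)
            {
                double sum = 0.0;
                for (int j = 0; j < N; j++)
                    sum += A[i][j] * b[j];
                y[i] = sum;
            }
        }   /* implicit barrier: all tasks complete before y is printed */

        for (int i = 0; i < N; i++)
            printf("y[%d] = %g\n", i, y[i]);
        return 0;
    }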
Observations: While the tasks share data (namely, the vector b), they do not have any control dependencies; that is, no task needs to wait for the (partial) completion of any other. All tasks are of the same size in terms of the number of operations. Is this the maximum number of tasks into which we could decompose this problem?