Oddylos
On-demand dynamic load scheduler for hybrid computing systems
J.I. Agulleiro (1), J.J. Fernández (2)
(1) Associated unit CSIC+UAL. Univ. Almeria. 04120 Almeria. Spain.
(2) Spanish National Research Council (CSIC). Oviedo. Spain.
Contact: jjfernandez.software @ gmail.com
Hybrid CPU+GPU approach implemented in Oddylos. The system keeps a pool of pending tasks. A number of threads mapped to CPU cores (denoted C-threads) run concurrently in the system, together with dedicated threads (denoted G-threads) in charge of the tasks to be computed on the GPUs. The tasks are dispatched to the threads asynchronously and on demand. In the figure, the allocation of tasks to threads is color-coded. Note that the G-threads request tasks more often than the C-threads, because a GPU completes the calculations faster than a single CPU core. Likewise, faster GPUs are assigned work more frequently than more modest GPUs.
Description
Oddylos is a dynamic workload scheduler aimed at hybrid computing systems. It is based on an 'on-demand' strategy whereby the different computing elements asynchronously request a new piece of work whenever they become idle (see figure above). Modern computers are inherently heterogeneous, as they are equipped with CPU cores and GPUs (possibly multiple GPUs with different computing capabilities). In these systems, the workload must be distributed properly among the various processing elements if they are to be exploited fully. Oddylos orchestrates the workload so that the CPUs and the GPUs work collaboratively, thereby allowing full exploitation of the computing power available in modern computers.
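The on-demand strategy described above can be illustrated with a minimal sketch in C using POSIX threads. This is not the Oddylos source code: the names task_pool, worker, request_task and run_pool are hypothetical, and the speed of each device is merely simulated with a busy loop whose length is controlled by an assumed cost field. Each thread asks the shared pool for the next task as soon as it is idle, so threads with a lower cost naturally end up processing more tasks.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_WORKERS 16

/* Shared pool of tasks. Tasks are identified by their index only. */
typedef struct {
    int next_task;            /* index of the next task to hand out */
    int num_tasks;            /* total number of tasks in the pool  */
    pthread_mutex_t lock;     /* protects next_task                 */
} task_pool;

/* One worker thread (a C-thread or a G-thread in the text above). */
typedef struct {
    task_pool *pool;          /* set by run_pool                            */
    int cost;                 /* simulated work per task (assumption):
                                 a lower value models a faster device       */
    int done;                 /* number of tasks completed by this worker   */
} worker;

/* Hand out the next task index, or -1 when the pool is exhausted. */
static int request_task(task_pool *pool)
{
    int task = -1;
    pthread_mutex_lock(&pool->lock);
    if (pool->next_task < pool->num_tasks)
        task = pool->next_task++;
    pthread_mutex_unlock(&pool->lock);
    return task;
}

/* Thread body: keep requesting tasks on demand until none are left. */
static void *worker_loop(void *arg)
{
    worker *w = arg;
    while (request_task(w->pool) >= 0) {
        /* Stand-in for the real computation on a CPU core or a GPU. */
        volatile unsigned long sink = 0;
        for (long i = 0; i < w->cost * 100000L; i++)
            sink += (unsigned long)i;
        w->done++;
    }
    return NULL;
}

/* Run num_workers threads over num_tasks tasks; return tasks completed. */
int run_pool(int num_tasks, worker *workers, int num_workers)
{
    pthread_t threads[MAX_WORKERS];
    task_pool pool;
    int started = 0, total = 0;

    assert(workers != NULL && num_workers > 0 && num_workers <= MAX_WORKERS);
    pool.next_task = 0;
    pool.num_tasks = num_tasks;
    pthread_mutex_init(&pool.lock, NULL);

    for (int i = 0; i < num_workers; i++) {
        workers[i].pool = &pool;
        workers[i].done = 0;
        if (pthread_create(&threads[i], NULL, worker_loop, &workers[i]) != 0)
            break;            /* continue with the threads already started */
        started++;
    }
    for (int i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        total += workers[i].done;
    }
    pthread_mutex_destroy(&pool.lock);
    return total;
}
```

A caller would fill an array of worker entries with different cost values (for example, one small cost standing in for a GPU and several larger ones standing in for CPU cores), call run_pool, and then inspect the done counters to see how the tasks were shared out. The exact split varies from run to run because the dispatch is asynchronous.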
Oddylos has been designed for stand-alone computers. Its great advantage is that it works with any set of devices, homogeneous or heterogeneous, without treating them differently: Oddylos adapts to the devices present and distributes the tasks according to their speed. In addition, the hybrid approach implemented here can easily be combined with MPI, other parallel/distributed computing libraries, or shell programming (e.g. as in IMOD or Priism/IVE) to implement a hierarchical hybrid system based on distributed nodes, which takes advantage of the multiple CPU cores and multiple GPUs within each node.
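As an illustration of the hierarchical arrangement mentioned above, one possible first level is a static split of the tasks among the nodes, with each node then scheduling its own share on demand among its CPU cores and GPUs. The helper below is a hypothetical sketch and not part of Oddylos; it assumes that tasks are numbered consecutively and that the node index and node count are known (in an MPI program they would typically be obtained from MPI_Comm_rank and MPI_Comm_size).

```c
#include <assert.h>

/*
 * Hypothetical helper: compute the contiguous block of tasks [begin, end)
 * assigned to node `rank` when `total_tasks` tasks are split among `nodes`
 * nodes. The first (total_tasks % nodes) nodes receive one extra task.
 */
void node_task_range(int total_tasks, int nodes, int rank,
                     int *begin, int *end)
{
    assert(total_tasks >= 0 && nodes > 0 && rank >= 0 && rank < nodes);
    assert(begin != 0 && end != 0);

    int base  = total_tasks / nodes;   /* tasks every node gets           */
    int extra = total_tasks % nodes;   /* leftover tasks to spread around */

    *begin = rank * base + (rank < extra ? rank : extra);
    *end   = *begin + base + (rank < extra ? 1 : 0);
}
```

Each node would then place only the tasks in its own range into its local pool. A static split like this is the simplest choice; it does not balance load between nodes of different speed, which would require on-demand dispatch at the node level as well.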
Oddylos has been successfully used for tomographic reconstruction in homogeneous (Tomo3D) and hybrid (Tomo3Dhybrid) systems. The source code is provided here in the hope that it will help you create your own dynamic schedulers and implement hybrid computing approaches for your own problems.
A detailed description of the procedure implemented in Oddylos can be found in the following articles:
Hybrid computing: CPU+GPU co-processing and its application to tomographic reconstruction.
J.I. Agulleiro, F. Vazquez, E.M. Garzon, J.J. Fernandez.
Ultramicroscopy 115:109-114, 2012.
Fast tomographic reconstruction on multicore computers.
J.I. Agulleiro, J.J. Fernandez.
Bioinformatics 27:582-583, 2011.
Please cite these articles if you use Oddylos in your work.
Download
Current version: February 2012
Available material:
Documentation in PDF: oddylos.pdf
Source code (in C)