JPlasma

Description

JPlasma is a Java port of PLASMA (Parallel Linear Algebra for Scalable Multi-core Architectures).

The Parallel Linear Algebra for Scalable Multi-core Architectures (PLASMA) project aims to address the critical and highly disruptive situation that is facing the Linear Algebra and High Performance Computing community due to the introduction of multi-core architectures. PLASMA’s ultimate goal is to create software frameworks that enable programmers to simplify the process of developing applications that can achieve both high performance and portability across a range of new architectures. The development of programming models that enforce asynchronous, out of order scheduling of operations is the concept used as the basis for the definition of a scalable yet highly efficient software framework for Computational Linear Algebra applications.

It is difficult to overestimate the magnitude of the discontinuity that the high performance computing (HPC) community is about to experience because of the emergence of the next generation of multi-core and heterogeneous processor designs. For at least two decades, HPC programmers have taken it for granted that each successive generation of microprocessors would, either immediately or after minor adjustments, make their old software run substantially faster. But three main factors are converging to bring this "free ride" to an end.

First, system builders have encountered intractable physical barriers - too much heat, too much power consumption, and too much leaking voltage - to further increases in clock speeds. Second, physical limits on the number and bandwidth of pins on a single chip mean that the gap between processor performance and memory performance, which was already bad, will get increasingly worse. Finally, the design trade-offs being made to address the previous two factors will render commodity processors, absent any further augmentation, inadequate for the purposes of tera- and petascale systems for advanced applications.

This daunting combination of obstacles has forced the designers of new multi-core and hybrid systems, searching for more computing power, to explore architectures that software built on the old model is unable to exploit effectively without radical modification. Currently available Linear Algebra software packages rely on parallel implementations of the Basic Linear Algebra Subroutines (BLAS) to take advantage of multiple execution units. This solution is characterized by a fork-join model of parallel execution, which may result in suboptimal performance on current and future generations of multi-core processors since it introduces strict dependencies due to the presence of non-parallelizable portions of code. The PLASMA project aims to overcome the shortcomings of this approach by introducing a pipelined model of parallel execution. (source: http://icl.cs.utk.edu/plasma/index.html)
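
The difference between the fork-join model and a pipelined model can be illustrated with a small toy example. The sketch below is not PLASMA or JPlasma code: the class name, the two "stages", and the use of `CompletableFuture` are assumptions chosen only to show the scheduling idea. In the fork-join variant every stage-1 task must finish before any stage-2 task starts; in the pipelined variant each stage-2 task starts as soon as its own input is ready.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Illustration only: hypothetical names, not PLASMA or JPlasma code. */
final class SchedulingSketch {

    // Two toy "kernels" standing in for per-tile operations.
    static double stage1(double x) { return x + 1.0; }
    static double stage2(double x) { return x * 2.0; }

    /** Fork-join style: a barrier separates stage 1 from stage 2. */
    static double[] forkJoin(double[] in, ExecutorService pool) {
        List<CompletableFuture<Double>> first = new ArrayList<>();
        for (double x : in) {
            first.add(CompletableFuture.supplyAsync(() -> stage1(x), pool));
        }
        // Join: wait for ALL stage-1 tasks before forking stage 2.
        double[] mid = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            mid[i] = first.get(i).join();
        }
        List<CompletableFuture<Double>> second = new ArrayList<>();
        for (double m : mid) {
            second.add(CompletableFuture.supplyAsync(() -> stage2(m), pool));
        }
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = second.get(i).join();
        }
        return out;
    }

    /** Pipelined style: each stage-2 task depends only on its own stage-1 task. */
    static double[] pipelined(double[] in, ExecutorService pool) {
        List<CompletableFuture<Double>> chains = new ArrayList<>();
        for (double x : in) {
            chains.add(CompletableFuture.supplyAsync(() -> stage1(x), pool)
                                        .thenApplyAsync(v -> stage2(v), pool));
        }
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = chains.get(i).join();
        }
        return out;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            double[] in = {1.0, 2.0, 3.0};
            System.out.println(java.util.Arrays.toString(forkJoin(in, pool)));
            System.out.println(java.util.Arrays.toString(pipelined(in, pool)));
        } finally {
            pool.shutdown();
        }
    }
}
```

Both variants compute the same values; they differ only in when work becomes eligible to run. With uneven task durations, the barrier in `forkJoin` leaves threads idle while the slowest stage-1 task finishes, which is the kind of stall the pipelined model is meant to avoid.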

Features

Currently, only a small subset of LAPACK routines is available:
  • QR factorization.
  • Solving a system of linear equations using QR factorization.
  • LU factorization.
  • Solving a system of linear equations using LU factorization.
  • Cholesky factorization.
  • Solving a system of linear equations using Cholesky factorization.
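
To make concrete what "solving a system using Cholesky factorization" computes, here is a minimal single-threaded sketch in plain Java. It is not the JPlasma API and does not use tiles or threads; the class and method names are hypothetical, and it handles a single right-hand side. It factors a symmetric positive definite matrix as A = L * L^T and then solves by forward and backward substitution.

```java
/** Reference sketch only: hypothetical names, not the JPlasma API. */
final class CholeskySolveSketch {

    /** Overwrites the lower triangle of a with L such that A = L * L^T. */
    static void factor(double[][] a) {
        int n = a.length;
        for (int j = 0; j < n; j++) {
            double d = a[j][j];
            for (int k = 0; k < j; k++) {
                d -= a[j][k] * a[j][k];
            }
            if (d <= 0.0) {
                throw new ArithmeticException("matrix is not positive definite");
            }
            a[j][j] = Math.sqrt(d);
            for (int i = j + 1; i < n; i++) {
                double s = a[i][j];
                for (int k = 0; k < j; k++) {
                    s -= a[i][k] * a[j][k];
                }
                a[i][j] = s / a[j][j];
            }
        }
    }

    /** Given L from factor(), overwrites b with the solution x of A x = b. */
    static void solve(double[][] l, double[] b) {
        int n = l.length;
        // Forward substitution: L y = b.
        for (int i = 0; i < n; i++) {
            double s = b[i];
            for (int k = 0; k < i; k++) {
                s -= l[i][k] * b[k];
            }
            b[i] = s / l[i][i];
        }
        // Backward substitution: L^T x = y (only the lower triangle is read).
        for (int i = n - 1; i >= 0; i--) {
            double s = b[i];
            for (int k = i + 1; k < n; k++) {
                s -= l[k][i] * b[k];
            }
            b[i] = s / l[i][i];
        }
    }

    public static void main(String[] args) {
        double[][] a = {{4.0, 2.0}, {2.0, 3.0}};
        double[] b = {8.0, 8.0};
        factor(a);
        solve(a, b);
        System.out.println(java.util.Arrays.toString(b));
    }
}
```

The QR and LU based solvers follow the same two-phase pattern (factor, then substitute); only the factorization differs.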

Benchmark

JPlasma 1.0 was benchmarked against PLASMA 1.0 and JLAPACK 0.8. PLASMA was used in conjunction with ATLAS, and huge pages were disabled because JPlasma does not support them. Each timing in the tables below is the average over 10 calls of the routine with randomly generated data. The timings for the Java code exclude the "warm-up phase" (the first two calls take longer).
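
The timing protocol described above can be sketched as follows. This is not the original benchmark code: the class name, the method signature, and the stand-in workload are assumptions, and it is also an assumption that data generation was kept outside the timed region.

```java
import java.util.Random;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Sketch of a warm-up-then-average timing loop; all names are hypothetical. */
final class WarmupTiming {

    static volatile double sink; // keeps the stand-in workload from being optimized away

    /**
     * Calls routine (warmupCalls + measuredCalls) times on fresh input and
     * returns the mean time in milliseconds of the measured calls only.
     */
    static <T> double averageMillis(Supplier<T> freshInput, Consumer<T> routine,
                                    int warmupCalls, int measuredCalls) {
        if (measuredCalls <= 0) {
            throw new IllegalArgumentException("measuredCalls must be positive");
        }
        long totalNanos = 0L;
        for (int i = 0; i < warmupCalls + measuredCalls; i++) {
            T input = freshInput.get(); // data generation is not timed
            long start = System.nanoTime();
            routine.accept(input);
            long elapsed = System.nanoTime() - start;
            if (i >= warmupCalls) {     // discard the warm-up calls
                totalNanos += elapsed;
            }
        }
        return totalNanos / 1.0e6 / measuredCalls;
    }

    public static void main(String[] args) {
        Random rng = new Random(42L);
        int n = 200;
        Supplier<double[][]> randomMatrix = () -> {
            double[][] a = new double[n][n];
            for (double[] row : a) {
                for (int j = 0; j < n; j++) {
                    row[j] = rng.nextDouble();
                }
            }
            return a;
        };
        // Stand-in workload (sum of all entries of A * A), not a JPlasma routine.
        Consumer<double[][]> workload = a -> {
            double sum = 0.0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < n; j++) {
                    for (int k = 0; k < n; k++) {
                        sum += a[i][k] * a[k][j];
                    }
                }
            }
            sink = sum;
        };
        // Two warm-up calls and ten measured calls, as in the text.
        System.out.printf("average: %.2f ms%n", averageMillis(randomMatrix, workload, 2, 10));
    }
}
```

Discarding the first calls matters for Java code because the JIT compiler typically optimizes hot loops only after they have run for a while.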

Testbed:

  • 2 x Quad-Core Intel Xeon X5472 (3GHz, 12MB L2 Cache),
  • 32 GB RAM,
  • Ubuntu 8.10 (64-bit),
  • GCC 4.3.3,
  • GNU Fortran 4.3.3,
  • Sun Java 1.6.0_14 (64-Bit Server VM),
  • Java flags: -d64 -server -Xms2g -Xmx2g -XX:+UseParallelGC,
  • C/Fortran compiler flags: -O2

Library    1 thread   2 threads   4 threads   8 threads
PLASMA         4439        2301        1267         751
JPlasma       17523        8922        4788        2565
JLAPACK       27221           -           -           -

Average execution time (in milliseconds) for solving a system of linear equations 
(size: 3000 x 2000, 20 right hand sides) using QR factorization (DGELS).


Library    1 thread   2 threads   4 threads   8 threads
PLASMA         1274         692         381         225
JPlasma        7519        3927        2105        1176
JLAPACK        6051           -           -           -

Average execution time (in milliseconds) for solving a system of linear equations 
(size: 2000 x 2000, 20 right hand sides) using LU factorization (DGESV).


Library    1 thread   2 threads   4 threads   8 threads
PLASMA          495         289         170         120
JPlasma        2261        1162         552         351
JLAPACK        2296           -           -           -

Average execution time (in milliseconds) for solving a system of linear equations 
(size: 2000 x 2000, 20 right hand sides) using Cholesky factorization (DPOSV).
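
One way to read these tables is in terms of parallel speedup, S_p = T_1 / T_p, and efficiency, E_p = S_p / p, where T_p is the time on p threads. These are the standard strong-scaling definitions rather than anything stated in the benchmark itself, and the class below is a hypothetical helper. It uses the DGELS timings from the first table as input.

```java
/** Hypothetical helper for reading the timing tables; not part of JPlasma. */
final class SpeedupSketch {

    /** Speedup of a run taking tp relative to the single-thread time t1. */
    static double speedup(double t1, double tp) {
        return t1 / tp;
    }

    /** Parallel efficiency: speedup divided by the number of threads. */
    static double efficiency(double t1, double tp, int threads) {
        return t1 / (threads * tp);
    }

    public static void main(String[] args) {
        int[] threads = {1, 2, 4, 8};
        // DGELS timings in milliseconds, copied from the first table.
        double[] plasma = {4439, 2301, 1267, 751};
        double[] jplasma = {17523, 8922, 4788, 2565};
        System.out.println("threads  PLASMA S_p  PLASMA E_p  JPlasma S_p  JPlasma E_p");
        for (int i = 0; i < threads.length; i++) {
            System.out.printf("%7d  %10.2f  %10.2f  %11.2f  %11.2f%n",
                    threads[i],
                    speedup(plasma[0], plasma[i]),
                    efficiency(plasma[0], plasma[i], threads[i]),
                    speedup(jplasma[0], jplasma[i]),
                    efficiency(jplasma[0], jplasma[i], threads[i]));
        }
    }
}
```

For DGELS on 8 threads this gives a speedup of roughly 5.9 for PLASMA and roughly 6.8 for JPlasma, so JPlasma scales comparably even though its absolute times are several times higher.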

License

-- Innovative Computing Laboratory
-- Electrical Engineering and Computer Science Department
-- University of Tennessee
-- (C) Copyright 2008

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

* Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
* Neither the name of the University of Tennessee, Knoxville nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Download

In addition to an Ant build file, the source code distribution contains Eclipse project files.

version 1.2 (March 20, 2010) Changelog

binary: 

doc: 

source: 
