Scalable Parallel implementation of Conjugate Gradient Dense Linear System solver library that is NUMA-aware and cache-aware

Scalable Parallel implementation of Conjugate Gradient Dense Linear System Solver library that is NUMA-aware and cache-aware, version 1.71

Author: Amine Moulay Ramdane

Description:

I have come up with a new algorithm of my Parallel Conjugate gradient solver library, now it has become cache-aware, but you have to notice that this new cache-aware algorithm is more efficient on multicores, since i have benchmarked it against my previous algorithm and it has given a scalability of 5X on a Quadcore over the single thread of my previous algorithm , that's a really a big improvement ! and now i have made it NUMA-aware so that the memory transfers from the memory to the CPU have become parallelized so that it becomes really scalable on NUMA architecture, and this is also a big improvement !.

Hope you will be happy with my new algorithm and parallel library, frankly i think i have to write something like a PhD paper to explain more my new algorithm , but i will let it as it is at this moment... perhaps i will do it in the near future.

This scalable Parallel library is especially designed for large scale

industrial engineering problems that you find on industrial Finite element problems and such, this scalable Parallel library was ported to FreePascal and all the Delphi XE versions and even to Delphi 7, hope you will find it really good.

Sparse linear system solvers are ubiquitous in high performance computing (HPC) and often are the most computational intensive parts in scientific computing codes. A few of the many applications relying on sparse linear solvers include fusion energy simulation, space weather simulation, climate modeling, and environmental modeling, and finite element method, and large-scale reservoir simulations to enhance oil recovery by the oil and gas industry.

Conjugate Gradient is known to converge to the exact solution in n steps for a matrix of size n, and was historically first seen as a direct method because of this. However, after a while people figured out that it works really well if you just stop the iteration much earlier - often you will get a very good approximation after much fewer than n steps. In fact, we can analyze how fast Conjugate gradient converges. The end result is that Conjugate gradient is used as an iterative method for large linear systems today.

The Parallel implementation of Conjugate Gradient Linear System Solver that i programmed here is designed to be used to solve large sparse systems of linear equations where the direct methods can exceed available machine memory and/or be extremely time-consuming. for example the direct method of the Gauss algorithm takes O(n^2) in the back substitution process and is dominated by the O(n^3) forward elimination process, that means, if for example an operation takes 10^-9 second and we have 1000 equations , the elimination process in the Gauss algorithm will takes 0.7 second, but if we have 10000 equations in the system , the elimination process in the Gauss algorithm will take 11 minutes !. This is why i have develloped for you the Parallel implementation of Conjugate Gradient Linear System Solver in Object Pascal, that is very fast.

You have only one method to use that is Solve()

function TParallelConjugateGradient.Solve(var A: arrarrext;var B,X:VECT;var RSQ:DOUBLE;nbr_iter:integer;show_iter:boolean):boolean;

The system: A*x = b

The important parameters in the Solve() method are:

A is the matrix , B is the b vector, X the initial vector x,

nbr_iter is the number of iterations that you want and show_iter to show the number of iteration on the screen.

RSQ is the sum of the squares of the components of the residual vector A.x - b.

The Conjugate Gradient Method is the most prominent iterative method for solving sparse systems of linear equations. Unfortunately, many textbook treatments of the topic are written with neither illustrations nor intuition, and their victims can be found to this day babbling senselessly in the corners of dusty libraries. For this reason, a deep, geometric understanding of the method has been reserved for the elite brilliant few who have painstakingly decoded the mumblings of their forebears. Conjugate gradient is the most popular iterative method for solving large systems of linear equations. CG is effective for systems of the form A.x = b where x is an unknown vector, b is a known vector, A is a known square, symmetric, positive-definite (or positive-indefinite) matrix. These systems arise in many important settings, such as finite difference and finite element methods for solving partial differential equations, structural analysis, circuit analysis, and math homework

The Conjugate gradient method can also be applied to non-linear problems, but with much less success since the non-linear functions have multiple minimums. The Conjugate gradient method will indeed find a minimum of such a nonlinear function, but it is in no way guaranteed to be a global minimum, or the minimum that is desired. But the conjugate gradient method is great iterative method for solving large, sparse linear systems with a symmetric, positive, definite matrix.

In the method of conjugate gradients the residuals are not used as search directions, as in the steepest decent method, cause searching can require a large number of iterations as the residuals zig zag towards the minimum value for ill-conditioned matrices. But instead conjugate gradient method uses the residuals as a basis to form conjugate search directions . In this manner, the conjugated gradients (residuals) form a basis of search directions to minimize the quadratic function f(x)=1/2*Transpose(x)*A*x + Transpose(b)*x and to achieve faster speed and result of dim(N) convergence.

Language: FPC Pascal v2.2.0+ / Delphi 7+: http://www.freepascal.org/

Operating Systems: Windows

Minimum supported client: Windows Vista [desktop apps only]

Minimum supported server: Windows Server 2008 [desktop apps only]

Required FPC switches: -O3 -Sd -dFPC -dFreePascal

-Sd for delphi mode....

Required Delphi switches: -$H+ -DDelphi

For Delphi XE-XE7 use the -DXE switch

{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems

{$DEFINE CPU64} and {$DEFINE Windows64} for 64 bit systems

Please click on the small arrow on the right of the zip file bellow to download...