Threadpool engine with priorities

Efficient Threadpool with priorities version 3.07

Author: Amine Moulay Ramdane

Description:

Efficient Thread Pool Engine.

The following have been added:

- You can give the following priorities to jobs:

LOW_PRIORITY

NORMAL_PRIORITY

HIGH_PRIORITY

- Uses a fast concurrent FIFO queue that satisfies many requirements: it is FIFO fair, it efficiently minimizes cache-coherence traffic, and its pop() is energy efficient: when there are no items in the queue it does not spin-wait, but waits on a portable manual event object.

- Enters a wait state when there is no job in the queue, for more efficiency.

- You can distribute your jobs to the worker threads and call any method with the threadpool's execute() method.

- You can wait for the jobs to finish with the wait() method.

- Enqueue has O(1) complexity and dequeue has O(3) worst-case complexity.

- It can now use processor groups on Windows, so that it can use more than 64 logical processors.

- It's NUMA-aware and NUMA efficient.
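
To make the priority and blocking-pop ideas in the list above more concrete, here is a minimal sketch in Java (chosen only because the text further down compares against Java; it is not the Pascal source of the engine). It assumes one plain FIFO per priority level guarded by a single lock, which is far simpler than the cache-friendly concurrent queue described above. All names in it (PriorityJobQueueSketch, push, pop, close, demo) are hypothetical. A dequeue checks at most three queues, which is one possible reading of the O(3) bound.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: hypothetical names, not the engine's API.
final class PriorityJobQueueSketch {
    // Declared from most to least urgent; ordinal() is the queue index.
    enum Priority { HIGH_PRIORITY, NORMAL_PRIORITY, LOW_PRIORITY }

    private final List<ArrayDeque<Runnable>> queues = new ArrayList<>();
    private boolean closed = false;

    PriorityJobQueueSketch() {
        for (int i = 0; i < Priority.values().length; i++) {
            queues.add(new ArrayDeque<>());
        }
    }

    // Appends to the tail of the matching FIFO and wakes up waiting consumers.
    synchronized void push(Priority priority, Runnable job) {
        queues.get(priority.ordinal()).addLast(job);
        notifyAll();
    }

    // Blocks without spinning until a job is available.
    // Looks at no more than three queues per attempt, highest priority first.
    // Returns null once the queue has been closed and drained.
    synchronized Runnable pop() throws InterruptedException {
        while (true) {
            for (ArrayDeque<Runnable> queue : queues) {
                Runnable job = queue.pollFirst();
                if (job != null) {
                    return job;
                }
            }
            if (closed) {
                return null;
            }
            wait(); // sleeps until push() or close() signals
        }
    }

    synchronized void close() {
        closed = true;
        notifyAll();
    }

    // Queues four jobs in mixed order, then lets one worker drain them.
    static List<String> demo() {
        PriorityJobQueueSketch queue = new PriorityJobQueueSketch();
        List<String> order = new ArrayList<>();
        queue.push(Priority.LOW_PRIORITY, () -> order.add("low-1"));
        queue.push(Priority.NORMAL_PRIORITY, () -> order.add("normal-1"));
        queue.push(Priority.HIGH_PRIORITY, () -> order.add("high-1"));
        queue.push(Priority.HIGH_PRIORITY, () -> order.add("high-2"));
        queue.close();
        Thread worker = new Thread(() -> {
            try {
                Runnable job;
                while ((job = queue.pop()) != null) {
                    job.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return order;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Running main prints [high-1, high-2, normal-1, low-1]: the high-priority jobs run first and keep their FIFO order, then the normal job, then the low one.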

I have updated my efficient Threadpool engine with priorities and my Threadpool engine to version 3.0. I have come up with a new algorithm that is more optimized: it minimizes cache-line transfers, so that the serial part in Amdahl's law has been reduced by a factor of 2. I hope that you will be happy with this new efficient algorithm. I have also changed the concurrent FIFO queue of my efficient Threadpool engine with priorities, and the new concurrent FIFO queue is really fast.

As you may have noticed, designing and implementing an efficient Threadpool engine such as my efficient Threadpool engine with priorities is a somewhat hard job, so the new algorithm is also meant to make reasoning about correctness in concurrent programming easier. Now that the new algorithm is easier to reason about and I have tested it thoroughly, you can be more confident, because I think that the new algorithm of my efficient Threadpool engine with priorities is correct and stable, and it is also very fast.

Now look at the ThreadPoolExecutor class of Java, for example at its awaitTermination() method. The documentation says:

---

boolean awaitTermination(long timeout, TimeUnit unit)

Blocks until all tasks have completed execution after a shutdown request, or the timeout occurs, or the current thread is interrupted, whichever happens first.

---

Read more here:

https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ThreadPoolExecutor.html#method.summary

Did you notice?

With awaitTermination() in Java, when you wait for the tasks you have to wait for ALL the tasks, and that's not efficient. If you want to use the object from multiple threads, I think it will have the same effect. You can avoid some of these problems by using many objects of the ThreadPoolExecutor class, but this takes more resources and causes more and more context switches, and that's bad. I think C# has the same problem.

Other than that, the Java and C# thread pools don't support priorities: you cannot give a priority such as high, normal or low to a task/job. That's not good for games and other applications where you have to use priorities even if the system is not a realtime system.

This is why I decided to implement my efficient Threadpool engine version 3.0 with these characteristics. You can create a child object of the Threadpool class that uses the same worker threads and waits only for the tasks that you add with the execute() method, and my efficient Threadpool engine also supports 3 priorities: high, normal and low. That's where my efficient Threadpool engine comes in handy and that's where it's efficient. I hope you will like it.
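
The child-object idea described above can be sketched in Java as follows. This is only an illustration under stated assumptions, not the engine's implementation: JobGroupSketch, waitForAll(), pendingCount() and demo() are made-up names, and the shared workers are a standard ExecutorService created with Executors.newFixedThreadPool(). Each group counts its own pending jobs, so waiting on one group does not wait for the jobs of another group that runs on the same worker threads.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: several groups share one set of worker threads,
// and each group can wait for its own jobs only.
final class JobGroupSketch {
    private final ExecutorService sharedWorkers;
    private int pending = 0; // jobs of this group that have not finished yet

    JobGroupSketch(ExecutorService sharedWorkers) {
        this.sharedWorkers = sharedWorkers;
    }

    // Hands the job to the shared workers and counts it for this group.
    // Assumes the pool is still accepting work.
    void execute(Runnable job) {
        synchronized (this) {
            pending++;
        }
        sharedWorkers.execute(() -> {
            try {
                job.run();
            } finally {
                synchronized (this) {
                    if (--pending == 0) {
                        notifyAll();
                    }
                }
            }
        });
    }

    // Blocks until every job submitted through this group has finished.
    synchronized void waitForAll() throws InterruptedException {
        while (pending > 0) {
            wait();
        }
    }

    synchronized int pendingCount() {
        return pending;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Returns true if group A finished all its jobs while group B was still busy.
    static boolean demo() {
        ExecutorService workers = Executors.newFixedThreadPool(2);
        JobGroupSketch groupA = new JobGroupSketch(workers);
        JobGroupSketch groupB = new JobGroupSketch(workers);
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger doneA = new AtomicInteger();
        try {
            groupB.execute(() -> awaitQuietly(release)); // keeps one worker busy
            for (int i = 0; i < 3; i++) {
                groupA.execute(doneA::incrementAndGet);
            }
            groupA.waitForAll(); // returns although group B has not finished
            boolean bStillPending = groupB.pendingCount() == 1;
            release.countDown();
            groupB.waitForAll();
            // By contrast, awaitTermination() needs shutdown() first
            // and covers every task in the pool.
            workers.shutdown();
            boolean terminated = workers.awaitTermination(5, TimeUnit.SECONDS);
            return doneA.get() == 3 && bStillPending && terminated;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            workers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch both groups reuse the same two worker threads, so no extra threads or pools are created per group, which is the property the paragraph above is after.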

Please read the HTML tutorial inside the zip file to understand how to use the execute() and wait() methods etc.

Note that to enlarge the stack of the Threadpool's worker threads, which use TThread, you have to set the stack size for the executable.

Look into defines.inc; there are several options:

CPU32: for 32-bit architectures

CPU64: for 64-bit architectures

Look at the test.pas demo inside the zip file.

Language: FPC Pascal v2.2.0+ / Delphi 5+: http://www.freepascal.org/

Operating Systems: Windows, Linux and Mac (x86).

Required FPC switches: -O3 -Sd -dFPC -dWin32 -dFreePascal

-Sd is for Delphi mode.

Required Delphi switches: -DDelphi -DMSWINDOWS -$H+

For Delphi XE-XE7 use the -DXE switch

{$DEFINE CPU32} and {$DEFINE Windows32} for 32-bit systems

{$DEFINE CPU64} and {$DEFINE Windows64} for 64-bit systems

Note: testpool.pas is a parallel program that multiplies a matrix by a vector; it uses SSE+ and requires Delphi 5+. test.pas and test_thread.pas work with both FreePascal and Delphi.

Please click on the small arrow to the right of the zip file below to download it.