Kernel-mode synchronization constructs are provided by the operating system kernel and should be used only in two situations.
Inter-Process and Inter-AppDomain Synchronization
Use a kernel construct when synchronization must span two different processes or AppDomains. The best example is a single-instance application, which uses a kernel synchronization construct to detect the presence of another running instance before it starts. OUTLOOK is an example of such an application.
To Avoid Live Lock
Use a kernel construct when a thread would otherwise live-lock, that is, spin while waiting and waste a significant number of CPU cycles. A common example is downloading a file from the web when the server takes a very long time to respond: a user-mode construct would keep spinning and burn CPU for the whole wait, whereas a kernel construct blocks the thread until it is signaled.
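The single-instance case can be sketched with a named Mutex. This is a minimal sketch, not a definitive implementation; the class name SingleInstanceGuard, the method name, and the mutex name passed in are all hypothetical.

```csharp
using System;
using System.Threading;

public static class SingleInstanceGuard
{
    // Returns true if the caller is the first to create the named mutex.
    // The caller should keep 'mutex' alive until the application exits,
    // because the kernel object disappears when its last handle is closed.
    public static bool TryBecomeSingleInstance(string name, out Mutex mutex)
    {
        bool createdNew;
        // initiallyOwned = false: we only care whether the name already existed.
        mutex = new Mutex(false, name, out createdNew);
        if (!createdNew)
        {
            // Another instance created the kernel object first.
            mutex.Dispose();
            mutex = null;
        }
        return createdNew;
    }
}
```

An application would typically call TryBecomeSingleInstance at the top of Main with an application-specific name and exit immediately when it returns false.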
There are three types of kernel synchronization constructs: Event, Semaphore, and Mutex. Event and Semaphore are primitive constructs, and Mutex is built on top of them.
Kernel synchronization constructs are actually Win32 objects, each represented by a Windows handle (a pointer-sized value: 32 bits in a 32-bit process, 64 bits in a 64-bit process), and .NET provides wrapper classes over them.
In .NET, WaitHandle is the abstract base class that wraps the native Win32 handle of a kernel synchronization object. Events are represented by EventWaitHandle, which derives from WaitHandle, and by its subclasses AutoResetEvent and ManualResetEvent. An event can be in either of two states:
SET, also called the true or signaled state >> waiting threads are allowed to proceed and access the resource.
RESET, also called the false or non-signaled state >> threads that wait on the event are blocked.
Auto Reset Event
This class derives from WaitHandle through EventWaitHandle (WaitHandle>>EventWaitHandle>>AutoResetEvent) and lets just one thread at a time use the resource. As soon as one waiting thread is released by the signaled event, the OS automatically turns the state back to Reset (non-signaled or false), so all other threads keep waiting until Set is called again. This construct is mainly used to implement a read-one, write-one model.
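The automatic reset can be observed on a single thread by polling the event with a zero timeout. The class and method names below are hypothetical and exist only for illustration.

```csharp
using System.Threading;

public static class AutoResetEventDemo
{
    // Polls an AutoResetEvent three times and reports what each poll saw.
    public static bool[] Probe()
    {
        using (var gate = new AutoResetEvent(true))   // starts signaled
        {
            bool first = gate.WaitOne(0);    // true: consumes the signal, event resets itself
            bool second = gate.WaitOne(0);   // false: nobody called Set again
            gate.Set();                      // signal once more
            bool third = gate.WaitOne(0);    // true: the new signal is consumed
            return new[] { first, second, third };
        }
    }
}
```

Each successful WaitOne consumes exactly one signal, which is why the second poll fails until Set is called again.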
Manual Reset Event
This is almost the same as AutoResetEvent, but it can allow multiple threads to access the same resource. The OS does not reset its state when a thread passes through it in the signaled state, so every waiting thread is released. Some thread must call the Reset method explicitly to turn the state back to non-signaled. The most common use of this construct is to implement a write-one, read-many model.
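A minimal sketch of the release-everyone behavior: several worker threads wait on one ManualResetEvent, and a single Set lets all of them through. The class and method names are hypothetical.

```csharp
using System.Threading;

public static class ManualResetEventDemo
{
    // Starts 'workers' threads that all wait on the same event,
    // signals once, and returns how many threads got through.
    public static int ReleaseAll(int workers)
    {
        int passed = 0;
        using (var gate = new ManualResetEvent(false))   // starts non-signaled
        {
            var threads = new Thread[workers];
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    gate.WaitOne();                      // blocks until Set
                    Interlocked.Increment(ref passed);
                });
                threads[i].Start();
            }

            gate.Set();                                  // stays signaled: releases every waiter
            foreach (var t in threads) t.Join();
            gate.Reset();                                // must be reset explicitly
        }
        return passed;
    }
}
```

With an AutoResetEvent in the same position, one Set would release only one of the waiting threads and the Join calls would never complete.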
Semaphore
Use the Semaphore class to control access to a pool of resources. Threads enter the semaphore by calling the WaitOne method, which is inherited from the WaitHandle class, and release the semaphore by calling the Release method.
The count on a semaphore is decremented each time a thread enters the semaphore, and incremented when a thread releases the semaphore. When the count is zero, subsequent requests block until other threads release the semaphore. When all threads have released the semaphore, the count is at the maximum value specified when the semaphore was created.
There is no guaranteed order, such as FIFO or LIFO, in which blocked threads enter the semaphore.
A thread can enter the semaphore multiple times, by calling the WaitOne method repeatedly. To release some or all of these entries, the thread can call the parameterless Release() method overload multiple times, or it can call the Release(Int32) method overload that specifies the number of entries to be released.
Semaphores are of two types: local semaphores and named system semaphores.
Named system semaphore: if you create a Semaphore object using a constructor that accepts a name, it is associated with an operating-system semaphore of that name. Named system semaphores are visible throughout the operating system, and can be used to synchronize the activities of processes. You can create multiple Semaphore objects that represent the same named system semaphore, and you can use the OpenExisting method to open an existing named system semaphore.
Local semaphore: exists only within your process. It can be used by any thread in your process that has a reference to the local Semaphore object. Each Semaphore object is a separate local semaphore.
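As a rough illustration of a local semaphore guarding a pool of resources, the sketch below lets at most a fixed number of threads into the protected region and records the highest concurrency it observed. The names and the 20 ms sleep that stands in for real work are assumptions.

```csharp
using System.Threading;

public static class SemaphoreDemo
{
    // Runs 'workers' threads through a region limited to 'slots' concurrent
    // entries and returns the highest number of threads seen inside at once.
    public static int MaxObservedConcurrency(int workers, int slots)
    {
        int current = 0, max = 0;
        object sync = new object();
        using (var pool = new Semaphore(slots, slots))   // initial count = maximum count
        {
            var threads = new Thread[workers];
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    pool.WaitOne();                      // enter: count is decremented
                    try
                    {
                        lock (sync) { current++; if (current > max) max = current; }
                        Thread.Sleep(20);                // simulated use of the resource
                        lock (sync) { current--; }
                    }
                    finally
                    {
                        pool.Release();                  // leave: count is incremented
                    }
                });
                threads[i].Start();
            }
            foreach (var t in threads) t.Join();
        }
        return max;
    }
}
```

The returned value never exceeds the slots argument, because a thread can only increment the counter after WaitOne has admitted it.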
Mutex
When two or more threads need to access a shared resource at the same time, the system needs a synchronization mechanism to ensure that only one thread at a time uses the resource. Mutex is a synchronization primitive that grants exclusive access to the shared resource to only one thread. If a thread acquires a mutex, the second thread that wants to acquire that mutex is suspended until the first thread releases the mutex.
The Mutex class enforces thread identity, so a mutex can be released only by the thread that acquired it. By contrast, the Semaphore class does not enforce thread identity.
If a thread terminates while owning a mutex, the mutex is said to be abandoned. The state of the mutex is set to signaled, and the next waiting thread gets ownership.
Mutexes are of two types: local mutexes, which are unnamed, and named system mutexes.
A local mutex exists only within your process. It can be used by any thread in your process that has a reference to the Mutex object that represents the mutex. Each unnamed Mutex object represents a separate local mutex.
Named system mutexes are visible throughout the operating system, and can be used to synchronize the activities of processes. You can create a Mutex object that represents a named system mutex by using a constructor that accepts a name. The operating-system object can be created at the same time, or it can exist before the creation of the Mutex object. You can create multiple Mutex objects that represent the same named system mutex, and you can use the OpenExisting method to open an existing named system mutex.
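A local (unnamed) mutex used for mutual exclusion inside one process might look like the sketch below. The class and method names are hypothetical, and the shared counter is only a stand-in for a real shared resource.

```csharp
using System.Threading;

public static class LocalMutexDemo
{
    // Increments a shared counter from several threads, using an unnamed
    // (process-local) Mutex to make each increment exclusive.
    public static int CountWithMutex(int threadCount, int incrementsPerThread)
    {
        int total = 0;
        using (var mutex = new Mutex())                  // unnamed: local to this process
        {
            var threads = new Thread[threadCount];
            for (int i = 0; i < threadCount; i++)
            {
                threads[i] = new Thread(() =>
                {
                    for (int j = 0; j < incrementsPerThread; j++)
                    {
                        mutex.WaitOne();                 // acquire ownership
                        try { total++; }
                        finally { mutex.ReleaseMutex(); } // released by the owning thread
                    }
                });
                threads[i].Start();
            }
            foreach (var t in threads) t.Join();
        }
        return total;
    }
}
```

Because ReleaseMutex is called in a finally block by the same thread that called WaitOne, the thread-identity rule is respected and the mutex is not left abandoned if the protected code throws.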
Mutex and Terminal Services
On a server that is running Terminal Services, a named system mutex can have two levels of visibility. If its name begins with the prefix "Global\", the mutex is visible in all terminal server sessions. If its name begins with the prefix "Local\", the mutex is visible only in the terminal server session where it was created. In that case, a separate mutex with the same name can exist in each of the other terminal server sessions on the server.
If you do not specify a prefix when you create a named mutex, it takes the prefix "Local\". Within a terminal server session, two mutexes whose names differ only by their prefixes are separate mutexes, and both are visible to all processes in the terminal server session.
That is, the prefix names "Global\" and "Local\" describe the scope of the mutex name relative to terminal server sessions, not relative to processes.
Volatile Keyword (Atomic Read/Write)
A CPU usually reads the value of a variable from its cache if it is present there, and otherwise fetches it from main memory; the compiler and JIT may also keep a value in a register. As a result, when multiple threads read and write the same variable, one thread can keep working with a stale copy. C# provides the volatile modifier, which is usually applied to a field that is accessed by multiple threads without using the lock statement to serialize access. The volatile modifier tells the compiler, the runtime, and the hardware not to cache the field in a register or reorder accesses around it in the ways described below, so that a value written by one thread becomes visible to the other threads.
Volatile Read vs. Volatile Write
A read of a volatile field is called a volatile read. A volatile read has "acquire semantics"; that is, it is guaranteed to occur prior to any references to memory that occur after it in the instruction sequence.
A write of a volatile field is called a volatile write. A volatile write has "release semantics"; that is, it is guaranteed to happen after any memory references prior to the write instruction in the instruction sequence.
Note: volatile is the cheapest form of synchronization when all we need is to synchronize simple reads and writes of a single field, and it never blocks a thread. It does not make compound operations such as count++ atomic.
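A typical use of a volatile field is a stop flag that one thread sets and another thread polls. The sketch below assumes a hypothetical StoppableWorker class; the 10 ms sleep and the 5-second join timeout are arbitrary values chosen for the illustration.

```csharp
using System.Threading;

public class StoppableWorker
{
    // volatile: the loop in Run must re-read the field on every iteration.
    private volatile bool _stopRequested;
    private long _iterations;

    public void Run()
    {
        while (!_stopRequested)
        {
            _iterations++;          // touched only by the worker thread
        }
    }

    public void RequestStop() { _stopRequested = true; }

    // Safe to read once the worker thread has been joined.
    public long Iterations { get { return _iterations; } }

    // Starts a worker, asks it to stop, and reports whether it stopped in time.
    public static bool StartAndStop()
    {
        var worker = new StoppableWorker();
        var thread = new Thread(worker.Run);
        thread.Start();
        Thread.Sleep(10);
        worker.RequestStop();
        return thread.Join(5000);   // true if the loop observed the flag
    }
}
```

Without the volatile modifier, an optimizing JIT is allowed to hoist the read of the flag out of the loop, in which case the worker might never notice the stop request.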
Interlocked Methods: Perform Atomic Read, Write, and Update Operations
There are several situations in which we just need simple mathematical operations, such as add and subtract, performed in a thread-safe way. A well-known example is an instance counter. The System.Threading.Interlocked class provides several simple methods that manipulate a shared variable atomically. Interlocked.Increment and Interlocked.Decrement are the most commonly used methods of the class.
The methods of this class help protect against errors that can occur when the scheduler switches contexts while a thread is updating a variable that can be accessed by other threads, or when two threads are executing concurrently on separate processors. The members of this class do not throw exceptions. They also do not block any thread, so Interlocked should be preferred as a synchronization construct whenever a single atomic operation is enough.
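The counter scenario can be sketched as follows. The class and method names are hypothetical; only Interlocked.Increment comes from the text above.

```csharp
using System.Threading;

public static class InterlockedCounterDemo
{
    private static int _count;

    // Increments a shared counter from several threads without any lock
    // and returns the final value.
    public static int CountConcurrently(int threadCount, int incrementsPerThread)
    {
        _count = 0;
        var threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            threads[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                {
                    Interlocked.Increment(ref _count);   // atomic read-modify-write
                }
            });
            threads[i].Start();
        }
        foreach (var t in threads) t.Join();
        return _count;
    }
}
```

Replacing the Interlocked.Increment call with a plain _count++ would make the result unpredictable, because the increment is a separate read, add, and write.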
Monitor Class
Monitor is the .NET equivalent of a Win32 critical section and is used to lock a section of application code. It is the most frequently used synchronization construct. Also see the lock keyword below.
The Monitor class controls access to objects by granting a lock for an object to a single thread. Object locks provide the ability to restrict access to a block of code, commonly called a critical section. Use the Enter and Exit methods to mark the beginning and end of a critical section.
If the critical section is a set of contiguous instructions, then the lock acquired by the Enter method guarantees that only a single thread can execute the enclosed code with the locked object. In this case, it is recommended you place those instructions in a try block and place the Exit instruction in a finally block.
Note : Use Monitor to lock objects (that is, reference types), not value types.
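The Enter/Exit pattern with try and finally might look like this. The MonitorAccount class is a hypothetical example, and the sketch uses the Monitor.Enter overload that takes a lockTaken flag, which is one common way to make the finally block safe.

```csharp
using System.Threading;

public class MonitorAccount
{
    private readonly object _sync = new object();   // reference type used as the lock
    private int _balance;

    public void Deposit(int amount)
    {
        bool lockTaken = false;
        try
        {
            Monitor.Enter(_sync, ref lockTaken);    // start of the critical section
            _balance += amount;
        }
        finally
        {
            if (lockTaken) Monitor.Exit(_sync);     // end of the critical section
        }
    }

    public int Balance
    {
        get
        {
            bool lockTaken = false;
            try
            {
                Monitor.Enter(_sync, ref lockTaken);
                return _balance;
            }
            finally
            {
                if (lockTaken) Monitor.Exit(_sync);
            }
        }
    }
}
```

Placing Exit in the finally block means the lock is released even when the code inside the critical section throws.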
C# lock Keyword
The lock statement is a wrapper over the Monitor class: the compiler expands it into Monitor.Enter and Monitor.Exit calls inside a try/finally block to implement a critical section.
Lock Keyword Best Practices
In general, avoid locking on a public type, or on instances beyond your code's control. The common constructs lock (this), lock (typeof (MyType)), and lock ("myLock") violate this guideline, because any other code that can reach the same object can lock on it too.
Best practice is to define a private object to lock on, or a private static object variable to protect data common to all instances.
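The two recommendations can be sketched together: a private instance object protects per-instance state, and a private static object protects state shared by all instances. RequestTracker and its members are hypothetical names.

```csharp
public class RequestTracker
{
    // Private static lock object: guards data common to all instances.
    private static readonly object StaticSync = new object();
    private static int _totalRequests;

    // Private instance lock object: guards this instance's data only.
    private readonly object _sync = new object();
    private int _requests;

    public void Record()
    {
        lock (_sync) { _requests++; }
        lock (StaticSync) { _totalRequests++; }
    }

    public int Requests
    {
        get { lock (_sync) { return _requests; } }
    }

    public static int TotalRequests
    {
        get { lock (StaticSync) { return _totalRequests; } }
    }
}
```

Because neither lock object is visible outside the class, no external code can take the same lock and interfere with it.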
The ManualResetEventSlim and SemaphoreSlim Classes
Both Slim constructs work like their kernel-mode counterparts, except that they spin in user mode first and defer creating the kernel-mode construct until the first time contention occurs. Their Wait methods let you pass a timeout and a CancellationToken. Unlike the kernel constructs, they cannot be named, so they work only within a single process.
Use these constructs when your code does not block the thread for a long time and the synchronization code is not called very frequently.
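The timeout and cancellation options on Wait can be sketched with SemaphoreSlim; ManualResetEventSlim offers Wait overloads that accept the same kinds of arguments. The class and method names and the 50 ms timeout are assumptions made for the illustration.

```csharp
using System;
using System.Threading;

public static class SlimWaitDemo
{
    // Returns { acquired immediately, acquired within timeout, wait was cancelled }.
    public static bool[] Probe()
    {
        using (var slots = new SemaphoreSlim(1, 1))
        using (var cts = new CancellationTokenSource())
        {
            bool first = slots.Wait(0);                               // true: one slot was free
            bool second = slots.Wait(TimeSpan.FromMilliseconds(50));  // false: times out, no slot left

            cts.Cancel();
            bool cancelled = false;
            try
            {
                // An already-cancelled token makes Wait throw instead of blocking.
                slots.Wait(Timeout.Infinite, cts.Token);
            }
            catch (OperationCanceledException)
            {
                cancelled = true;
            }

            slots.Release();                                          // give the slot back
            return new[] { first, second, cancelled };
        }
    }
}
```

Passing a token lets a blocked caller be released from outside, which the plain WaitOne method of the kernel-mode classes does not offer directly.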
ReaderWriterLock
Defines a lock that supports a single writer and multiple readers. This class is not recommended for new development.
The ReaderWriterLockSlim Class
Represents a lock that is used to manage access to a resource, allowing multiple threads for reading or exclusive access for writing.
Use ReaderWriterLockSlim to protect a resource that is read by multiple threads and written to by one thread at a time. ReaderWriterLockSlim allows multiple threads to be in read mode, allows one thread to be in write mode with exclusive ownership of the lock, and allows one thread that has read access to be in upgradeable read mode, from which the thread can upgrade to write mode without having to relinquish its read access to the resource.
ReaderWriterLockSlim is similar to ReaderWriterLock, but it has simplified rules for recursion and for upgrading and downgrading lock state. ReaderWriterLockSlim avoids many cases of potential deadlock. In addition, the performance of ReaderWriterLockSlim is significantly better than ReaderWriterLock. ReaderWriterLockSlim is recommended for all new development.
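The three modes (read, write, and upgradeable read) can be sketched with a small dictionary-backed cache. SimpleCache and its members are hypothetical names; the locking calls are the ones ReaderWriterLockSlim exposes.

```csharp
using System.Collections.Generic;
using System.Threading;

public class SimpleCache
{
    private readonly ReaderWriterLockSlim _lock = new ReaderWriterLockSlim();
    private readonly Dictionary<string, int> _items = new Dictionary<string, int>();

    // Read mode: many threads may be inside at the same time.
    public bool TryGet(string key, out int value)
    {
        _lock.EnterReadLock();
        try { return _items.TryGetValue(key, out value); }
        finally { _lock.ExitReadLock(); }
    }

    // Write mode: exclusive ownership of the lock.
    public void Set(string key, int value)
    {
        _lock.EnterWriteLock();
        try { _items[key] = value; }
        finally { _lock.ExitWriteLock(); }
    }

    // Upgradeable read mode: read first, upgrade to write only if needed.
    public int GetOrAdd(string key, int valueIfMissing)
    {
        _lock.EnterUpgradeableReadLock();
        try
        {
            int existing;
            if (_items.TryGetValue(key, out existing)) return existing;

            _lock.EnterWriteLock();
            try { _items[key] = valueIfMissing; }
            finally { _lock.ExitWriteLock(); }
            return valueIfMissing;
        }
        finally { _lock.ExitUpgradeableReadLock(); }
    }
}
```

Only one thread at a time can hold the upgradeable read lock, so GetOrAdd calls are serialized against each other while plain TryGet calls can still run alongside them.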
The CountdownEvent Class
Represents a synchronization primitive that is signaled (allows waiting threads to execute) only when its count reaches zero. This is the exact reverse of a semaphore, which allows execution until its count reaches zero. This construct can be a good replacement for recursive locks and reduces the chance of deadlock. Use this class when you would otherwise need to take a recursive lock in algorithms such as Quicksort.
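A minimal sketch of the count-reaches-zero behavior: the calling thread waits on a CountdownEvent until every worker has signaled. The class and method names are hypothetical.

```csharp
using System.Threading;

public static class CountdownEventDemo
{
    // Starts 'workers' threads, waits until all of them have signaled,
    // and returns how many finished their work.
    public static int RunWorkers(int workers)
    {
        int finished = 0;
        using (var done = new CountdownEvent(workers))   // initial count = number of workers
        {
            var threads = new Thread[workers];
            for (int i = 0; i < workers; i++)
            {
                threads[i] = new Thread(() =>
                {
                    Interlocked.Increment(ref finished); // the "work"
                    done.Signal();                       // decrements the count by one
                });
                threads[i].Start();
            }

            done.Wait();                                 // returns once the count reaches zero
            foreach (var t in threads) t.Join();         // make sure no thread is still inside Signal
        }
        return finished;
    }
}
```

The Join calls after Wait are there so that the event is not disposed while a worker thread is still returning from Signal.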
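Other Classes
There are a couple of other classes in the System.Threading namespace, but they are omitted here because they are rarely needed.
The OneManyLock Class
This is almost the same as the ReaderWriterLockSlim class but has several performance benefits over it. Note that OneManyLock is not part of the .NET class library; it comes from Jeffrey Richter's book CLR via C# and his Power Threading library.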