Stay away from Volatile in threaded code?

In 2006 I was inspired by a 2001 Dr. Dobb's article by Andrei Alexandrescu on how to use volatile as a type modifier, similar to const. Since then my understanding of volatile has grown, and although it can look attractive for threaded code, you should think twice about why you want to use it.

Let's get some facts straight
  1. Reordering 1: Between volatiles, from C++ and the Perils of Double-Checked Locking.
    Although the C++ standard prevents compilers from reordering reads and writes to volatile data within a thread, it imposes no constraints at all on such reorderings across threads.

  2. Reordering 2: According to Herb Sutter in Volatile Vs Volatile
    The compiler can reorder non-volatile reads and writes relative to volatile reads and writes as follows:
    • Ordinary reads can move in either direction across a volatile read or write.
    • Ordinary writes cannot move at all across a volatile read or write.
    He also states that these two rules of thumb are compiler-specific, i.e. you had better check your compiler's documentation on this point, or not rely on them at all.

    [For x86/x64 platforms this CPU reordering information might also be informative]

  3. For volatile variables to be useful for thread communication, reads and writes of these variables must be atomic.

  4. Volatile does not guarantee atomic reads or writes. (For atomic read/write together with volatile see my article at CodeProject).

  5. Volatile forces the compiler to generate code that performs actual memory reads and writes instead of caching values in registers. This means that every read or write of a volatile variable can be considerably slower than an ordinary access, which the compiler is free to keep in a register or eliminate altogether.

  6. Volatile data is not optimized by the compiler. This is a double-edged sword, since the compiler is usually very good at making code more efficient.
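
Facts 1 to 4 together are why a volatile flag is not enough to hand data from one thread to another. Here is a minimal sketch of the portable alternative, assuming a C++11 compiler and using std::atomic with release/acquire ordering. The names payload, ready and publish_and_consume are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // ordinary, non-volatile data
std::atomic<bool> ready{false};   // atomic flag instead of a "volatile bool"

// Runs one producer and one consumer thread and returns the value the
// consumer read after it saw the flag become true.
int publish_and_consume() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread consumer([&seen] {
        // Acquire load: once this reads true, the write to payload that
        // came before the release store is guaranteed to be visible.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;
        assert(seen == 42);       // cannot fire with release/acquire
    });

    std::thread producer([] {
        payload = 42;             // ordinary write
        // Release store: the write above cannot be moved after it.
        ready.store(true, std::memory_order_release);
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With a plain volatile bool in place of the atomic flag, the ISO standard gives no such guarantee across threads, and the read and write of the flag itself would not be guaranteed atomic either.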

What is volatile good for?
The volatile keyword is not a threading or synchronization primitive in portable C or C++. It was mainly intended to
  1. Allow access to memory mapped devices
    I.e. pointers to data structures in coherent memory that can be modified by I/O devices.
  2. Variables in signal handlers and between setjmp and longjmp (ref: Volatile at Wikipedia)
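
The signal handler case is the easiest of these to show in a runnable form. Below is a small sketch, assuming a hosted implementation where SIGINT can be raised with std::raise. The names got_signal, on_signal and signal_sets_flag are hypothetical.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical flag name. volatile std::sig_atomic_t is the type the
// standard allows a signal handler to write to.
volatile std::sig_atomic_t got_signal = 0;

extern "C" void on_signal(int) {
    got_signal = 1;   // the handler does nothing but set the flag
}

// Installs the handler, raises SIGINT in the calling thread and
// reports whether the handler set the flag.
bool signal_sets_flag() {
    got_signal = 0;
    auto previous = std::signal(SIGINT, on_signal);
    assert(previous != SIG_ERR);
    std::raise(SIGINT);             // handler runs before raise returns
    std::signal(SIGINT, previous);  // restore the previous disposition
    return got_signal == 1;
}
```

Here volatile does what it was meant to do: it stops the compiler from assuming that got_signal cannot change behind the back of the normal control flow. Note that this is communication between a handler and the code it interrupts, not between threads.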

What can volatile be used for, even if it's not the "intention" of it?
It is better to use memory barriers and locks to guarantee correctness in the face of reordering (by the CPU, the compiler, or the memory system).

That said, it is also not uncommon to use volatile for lock-free ring buffers used by network devices or other input/output devices, where the device changes the pointers to indicate what has been processed. I have also seen this in lock-free ring buffers that act as the message queue between interrupt-driven firmware drivers and the application layer.

A simplified example of this is my own Lock-Free Single Producer Single Consumer queue. You can find it here or at CodeProject.

However, even this is a little like Russian Roulette, because the synchronization is not guaranteed by volatile itself. Volatile only guarantees that volatile accesses won't be reordered in relation to other volatile accesses (ref here). So why is it Russian Roulette? I think one of Herb Sutter's arguments in the comp.lang.c++.moderated entry "Am I or Alexandrescu wrong about singletons?" says it all:
Please remember this: Standard ISO C/C++ volatile is useless for multithreaded programming. No argument otherwise holds water; at best the code may appear to work on some compilers/platforms
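
For comparison, here is a sketch of a single producer, single consumer ring buffer that takes its ordering guarantees from std::atomic instead of volatile. This is not the queue linked above. SpscRing and transfer_sum are hypothetical names, and the sketch assumes C++11 and exactly one producer thread and one consumer thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>

// Hypothetical class. One slot is always left empty so that "full"
// and "empty" can be told apart.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N >= 2, "need at least two slots");
public:
    // Called by the single producer only.
    bool push(const T& value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) {
            return false;                               // full
        }
        slots_[tail] = value;
        tail_.store(next, std::memory_order_release);   // publish the slot
        return true;
    }

    // Called by the single consumer only.
    bool pop(T& out) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) {
            return false;                               // empty
        }
        out = slots_[head];
        head_.store((head + 1) % N, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, N> slots_{};
    std::atomic<std::size_t> head_{0};   // next slot to read
    std::atomic<std::size_t> tail_{0};   // next slot to write
};

// Sends 1..count from this thread to a consumer thread through a small
// ring and returns the sum the consumer computed.
long transfer_sum(int count) {
    assert(count >= 0);
    SpscRing<int, 8> ring;
    long sum = 0;
    std::thread consumer([&] {
        for (int received = 0; received < count;) {
            int v = 0;
            if (ring.pop(v)) {
                sum += v;
                ++received;
            } else {
                std::this_thread::yield();
            }
        }
    });
    for (int i = 1; i <= count; ++i) {
        while (!ring.push(i)) {
            std::this_thread::yield();
        }
    }
    consumer.join();
    return sum;
}
```

The release store on each index pairs with the acquire load on the other side, so the slot contents are visible before the index that announces them. That pairing is the guarantee volatile alone does not give.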

So what (else?) can we use volatile for? Well, Andrei Alexandrescu explains some of it in his article. I have summarized some of his points below at "So what about Andrei's article about Volatile and LockingPtr?". I've also added some very strong complaints that this is absolutely wrong from the C++ standard's point of view (even if it seems to work).

What is volatile NOT good for?

  1. Shared data protected by sound locking makes volatile unnecessary and even harmful for your software.
    Shared data that is declared volatile still needs to be protected by locking, but since the shared data is declared volatile the compiler is prevented from optimizing access to it. When the lock is held the shared data cannot be modified by anyone else (i.e. it is not volatile by nature), so making it volatile does nothing except slow down your access.

  2. When using concurrent, threaded code to achieve faster processing.
    Reasoning: Volatile is definitely counterproductive here; it can be far more expensive than ordinary non-volatile variables. Compute-intensive threaded software will have substantially more memory activity if it uses volatile. Memory access is what slows execution down, and speed is what we were after, right?

  3. When using volatile as a means for thread synchronization.
    Reasoning: It is simply too easy to get wrong, so volatile should be avoided here. Global variables modified by interrupts or by threads are one example where it may be OK to use it. My conclusion is that if you see a variable made volatile in threaded code, that should be a red flag worth investigating.
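
Point 1 in the list above can be sketched as follows, assuming C++11. Counter and count_with_threads are hypothetical names. The shared value is a plain long, and the mutex alone makes the concurrent updates correct.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical type. The lock, not volatile, protects the data.
struct Counter {
    std::mutex m;
    long value = 0;   // deliberately not volatile

    void add(long n) {
        std::lock_guard<std::mutex> lock(m);
        value += n;
    }

    long get() {
        std::lock_guard<std::mutex> lock(m);
        return value;
    }
};

// Starts `threads` threads that each add 1 to a shared counter
// `per_thread` times, then returns the final count.
long count_with_threads(int threads, int per_thread) {
    assert(threads >= 0 && per_thread >= 0);
    Counter c;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&c, per_thread] {
            for (int i = 0; i < per_thread; ++i) {
                c.add(1);
            }
        });
    }
    for (auto& th : pool) {
        th.join();
    }
    return c.get();
}
```

Locking and unlocking the mutex already provide the ordering and visibility that are needed, so declaring value volatile would add nothing except extra memory traffic inside the critical section.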

    The references below are worth reading up on. They explain the intent and usage of volatile and when not to use it so much better than I have here.