https://habrahabr.ru/post/326138/ pthreads
https://c9x.me/articles/gthreads/intro.html Green Threads
https://www.youtube.com/watch?v=MfEkOcMILDo C++ parallels
https://habrahabr.ru/post/328348/
https://habrahabr.ru/post/319350/
https://www.nextplatform.com/2017/01/05/essentials-multiprocessor-programming/
https://computing.llnl.gov/tutorials/parallel_comp/
https://habrahabr.ru/company/infopulse/blog/311134/
https://habrahabr.ru/post/318374/
https://www.kernel.org/doc/Documentation/memory-barriers.txt
http://joeduffyblog.com/2016/11/30/15-years-of-concurrency/
Video lectures from Leningrad:
https://www.youtube.com/watch?v=kbERSWTGtKw&list=PLlb7e2G7aSpQCPeKTcVBHJns_JOxrc_fT
http://preshing.com/20111124/always-use-a-lightweight-mutex/
http://concurrencyfreaks.blogspot.com/
https://www.lektorium.tv/lecture/13557
https://jackmott.github.io/programming/2016/08/30/think-before-you-parallelize.html
http://www.crankyotaku.com/2016/04/linux-programming-threading-basics.html
https://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git
http://kcsrk.info/ocaml/multicore/2016/06/11/lock-free/
https://rcrowley.org/2010/01/06/things-unix-can-do-atomically.html
http://ithare.com/implementing-queues-for-event-driven-programs/
http://c9x.me/art/gthreads/intro.html
http://coldattic.info/shvedsky/pro/blogs/a-foo-walks-into-a-bar/posts/tagged/concurrency
https://habrahabr.ru/post/278413/
https://habrahabr.ru/post/278661/
https://habrahabr.ru/post/278685/
http://preshing.com/20160201/new-concurrent-hash-maps-for-cpp/
https://habrahabr.ru/post/211717/
https://deadlockempire.github.io/
http://www.loic-yvonnet.com/articles/multithreading-in-cpp14-part-8/
https://news.ycombinator.com/item?id=10984616 in C++
http://zeroturnaround.com/rebellabs/what-are-fibers-and-why-you-should-care/ FIBER
https://medium.com/@markpapadakis/coroutines-and-fibers-why-and-when-5798f08464fd#.d1qzqhvvd
http://eli.thegreenplace.net/2016/c11-threads-affinity-and-hyperthreading/
http://www.amazon.com/gp/product/1451536615?ie=UTF8&tag=thepracofpara-20&linkCode=as2&camp=1789&creative=390957&creativeASIN=1451536615 Sergei Babkin book
http://research.microsoft.com/en-us/um/people/lamport/tla/hyperbook.html
http://dgoldblatt.com/going-from-lock-free-to-wait-free.html
http://concurrencyfreaks.blogspot.com/
http://habrahabr.ru/post/266901/
http://habrahabr.ru/post/271549/
http://habrahabr.ru/post/271583/
https://news.ycombinator.com/item?id=13102721
http://www.reddit.com/r/cpp/comments/3b8ku9/multiproducer_single_consumer_lockfree_queue/
https://news.ycombinator.com/item?id=9584099
http://jvns.ca/blog/2015/10/31/papers-are-amazing-profiling-threaded-programs-with-coz/
The most commonly taught method of ensuring data integrity during concurrent operations is known as pessimistic locking: placing a lock around the entire read-and-update operation.
Lock
Read
Transform
Update
Fig 1: Pessimistic Locking
Although this simple model does guarantee that no other writer can modify the data while one holds the lock, it can lead to abysmal performance on low-contention systems, where writes and conflicts are infrequent.
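As a rough illustration, a pessimistic read-transform-update in C with a POSIX mutex might look like the following (a minimal sketch; the shared counter and the increment transform are made up for the example):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_value = 0;            /* the protected data */

void pessimistic_update(void)
{
    pthread_mutex_lock(&lock);           /* Lock */
    long v = shared_value;               /* Read */
    v = v + 1;                           /* Transform (done entirely under the lock) */
    shared_value = v;                    /* Update */
    pthread_mutex_unlock(&lock);
}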
What we hope to achieve with optimistic locking is to minimise the time we spend inside a lock, and by doing so increase our throughput, because while we hold a lock we block every other thread that needs it. To do this we need a system that satisfies the following assumptions:
There are very few conflicts. This is the most important factor; if it is not met, optimistic locking often turns out slower than pessimistic locking.
Data transformations are pure and have no side effects. That is, given a transformation function T(x) = k, the same input x always yields k regardless of any external changes.
It is cheap to retry transactions with new versions of data. In some systems this may not hold true, especially if it is computationally intensive to calculate the next version of the data.
We can determine conflicts between writes easily and relatively cheaply. There are a few ways of doing this: via a version number that records how many times the data has been written, or by using immutable data structures, where checking whether something changed reduces to an object-identity comparison.
The process of applying all changes or none at all is known as a transaction. Transactions guarantee (or they should) atomicity, which means all or nothing: you either get all changes or no changes. This also helps maintain the integrity of our data.
Read/Copy
Transform
Lock
Check
Swap
Fig 2: Optimistic Locking
Note that the transform step mutates/updates the local copied version. The check step validates whether the version that was copied is outdated. This can be done as previously mentioned via comparing the version or the object identity. If the check fails, the transaction is aborted and restarted. Else, the changes are committed.
The CUDA kernel function is identified by the type qualifier __global__. The built-in variables threadIdx and blockIdx provide the indices of the local thread and block, and allow a unique index to be computed for each 2D array element. Threads within the same block have access to 16 KB of on-chip shared memory (declared with the __shared__ type qualifier), which has much lower latency than device memory. The __syncthreads() call provides a barrier at which all threads within a block synchronize before execution continues.
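A minimal kernel sketch using the qualifiers described above (the kernel name, array layout, and 16x16 launch configuration are made-up example choices; the shared-memory staging is only there to show __shared__ and __syncthreads):

__global__ void add2d(const float *a, const float *b, float *c, int nx, int ny)
{
    __shared__ float tile[16][16];   /* low-latency on-chip memory shared by the block */
    int x = blockIdx.x * blockDim.x + threadIdx.x;   /* unique 2D element index */
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x < nx && y < ny)
        tile[threadIdx.y][threadIdx.x] = a[y * nx + x];  /* stage one operand in shared memory */
    __syncthreads();                 /* barrier: the whole block waits here */
    if (x < nx && y < ny)
        c[y * nx + x] = tile[threadIdx.y][threadIdx.x] + b[y * nx + x];
}

/* host side: launch with a 2D grid of 16x16-thread blocks
   dim3 block(16, 16);
   dim3 grid((nx + 15) / 16, (ny + 15) / 16);
   add2d<<<grid, block>>>(d_a, d_b, d_c, nx, ny);        */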
MULTICORE PROCESSORS
http://en.wikipedia.org/wiki/Multi-core_processor
http://www.cpu-world.com/cgi-bin/CPUID.pl?CPUID=12373 Xeon E7-4870
http://www.siliconmechanics.com/files/WestmereEXInfo.pdf
http://stackoverflow.com/questions/8547778/why-is-one-loop-so-much-slower-than-two-loops
Important: if we are working with versioned objects, remember to bump the version at the end of the compare-and-swap operation.
Although in simple systems we can assume there is only a binary choice, abort or apply changes, in more complex systems we have more options. Take the VCS tool Git, for example. If we commit to a local copy of the shared data (the repository) and someone else commits as well, we can either:
Win - our changes are committed and written to the repo.
Lose - we need to "replay" our changes atop the changed data structure; essentially, the latest version.
Merge changes - both our changes can be merged if they do not modify the same files. Else we would need to resolve the conflicts.
Having a side-effect-free transformation is also important. For example, if we had stored all the transformations performed on the data and we wanted an arbitrary version, we could just apply the transformations to the initial version. A concrete example:
changes = [lambda x: x.append(1), lambda x: x.append(2)]
data = []
for item in changes:
    item(data)
assert data == [1, 2]
No matter how many times we apply the changes, as long as we start from the same initial data we will always get the same final result. This is especially useful if, say, we wanted to get an older version of our code. We wouldn't want that to differ due to some external conditions.
Let's see what a trivial optimistic locking implementation might look like in Python. First we will need some machinery for the versioned objects:
from threading import RLock, Thread

class Versioned:
    def __init__(self, data):
        self.data = data
        self.version = 0
        self.lock = RLock()

    def get(self):
        return self.data, self.version

    def commit(self, new_data, copied_version):
        with self.lock:
            if copied_version != self.version:
                raise Exception
            self.data = new_data
            self.version += 1
If you look at the Versioned.get method you will notice that there is no locking. There needn't be a lock because we are guaranteed that when the data is modified by the commit method, a lock will be held.
Then we need the threads to update their copies of the data and automatically retry failed transactions:
def task():
    while True:
        try:
            data, version = obj.get()
            obj.commit(data + 1, version)
            break
        except:
            pass

obj = Versioned(0)
threads = [Thread(target=task) for _ in range(10)]
[thr.start() for thr in threads]
[thr.join() for thr in threads]
assert obj.get() == (10, 10)
It won't solve all your concurrency issues, but when you encounter a situation that fits the four assumptions above, consider using optimistic locking.
http://jug.ru/meetings/308 Akka
There are several kinds of locks:
Blocking locks sleep a thread while it waits for another thread to release the lock. This is the usual behavior.
Spinlocks use a busy loop to constantly check to see if a lock has been released. This is more efficient if waiting is rare, but wastes CPU time if waiting is common.
Reader/writer locks allow multiple "reader" threads to enter a region simultaneously, but exclude all other threads (including readers) when a "writer" thread acquires the lock. This can be useful as many data structures are safe to read from multiple threads simultaneously, but unsafe to write while other threads are either reading or writing.
Recursive locks allow a single thread to acquire the same lock multiple times. Non-recursive locks can deadlock, crash, or otherwise misbehave when re-entered from the same thread.
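For the recursive case, POSIX exposes this behaviour through a mutex attribute; a minimal sketch (error checking omitted, function names are just for illustration):

#include <pthread.h>

pthread_mutex_t m;

void init_recursive_mutex(void)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);
}

void inner(void)
{
    pthread_mutex_lock(&m);     /* second acquisition by the same thread: allowed */
    /* ... */
    pthread_mutex_unlock(&m);   /* each lock must be balanced by an unlock */
}

void outer(void)
{
    pthread_mutex_lock(&m);     /* first acquisition */
    inner();                    /* re-enters the same lock without deadlocking */
    pthread_mutex_unlock(&m);
}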
https://www.arangodb.com/2015/08/lockfree-protection-of-data-structures-that-are-frequently-read/
http://greenteapress.com/semaphores/ Little Book of Semaphores
http://preshing.com/20150316/semaphores-are-surprisingly-versatile/
http://www.vldb.org/pvldb/vol8/p209-yu.pdf
http://goparallel.sourceforge.net/wp-content/uploads/2015/02/MPI-Application-Tune-Up-r5.pdf
http://habrahabr.ru/post/250383/ concurrent hashmap
http://habrahabr.ru/post/250523/
http://habrahabr.ru/post/232361/ Book about distributed systems
https://www.justsoftwaresolutions.co.uk/threading/locks-mutexes-semaphores.html
http://deathbytape.com/post/110008612055/cpp-threading
http://queue.acm.org/detail.cfm?id=2698990
http://habrahabr.ru/company/intel/blog/243385/
https://tech.yandex.ru/events/yagosti/cpp-user-group/talks/1797/
http://levin-matveev.livejournal.com/84750.html
https://view.officeapps.live.com/op/view.aspx lockless
http://concurrencyfreaks.blogspot.com/2014/10/linked-lists-with-wait-free-reads-in-c.html
http://moodycamel.com/blog/2014/a-fast-general-purpose-lock-free-queue-for-c++
POSIX threads
http://habrahabr.ru/post/247903/
http://deathbytape.com/post/110008612055/cpp-threading
https://news.ycombinator.com/item?id=9013768
http://prog-books.livejournal.com/265.html
https://news.ycombinator.com/item?id=7815443
http://existentialtype.wordpress.com/2014/04/09/parallelism-and-concurrency-revisited/
http://rocksdb.org/blog/677/avoid-expensive-locks-in-get/ TLS
http://austingwalters.com/multithreading-common-pitfalls/
https://gist.github.com/staltz/868e7e9bc2a7b8c1f754 Reactive Programming
https://news.ycombinator.com/item?id=7964873
http://aseigo.blogspot.com/2014/07/multi-process-architectures-suck.html
Python
http://blog.pirx.ru/media/files/2014/python-concurrency-talk/python-concurrency.html#1
https://speakerdeck.com/trent/parallelism-and-concurrency-with-python
http://www.reddit.com/r/Python/comments/276xlw/why_coroutines_in_an_oop_language/
http://www.yosefk.com/blog/parallelism-and-concurrency-need-different-tools.html
http://diesel.io/ Non-blocking IO in python
https://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2014/From-Parallel-to-Concurrent
Everything You Always Wanted to Know About Synchronization but Were Afraid to Ask
http://sigops.org/sosp/sosp13/papers/p33-david.pdf
http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
http://syprog.blogspot.com/2012/03/linux-threads-through-magnifier-local.html
http://syprog.blogspot.com/2012/03/linux-threads-through-magnifier-remote.html
http://habrahabr.ru/post/201826/ async
Actors
http://habrahabr.ru/company/tiktokcoach/blog/206300/
http://www.macs.hw.ac.uk/~rs46/posts/2014-02-03-objects-boxes-actors-agents.html
http://habrahabr.ru/post/143237/ multithreading 1
http://habrahabr.ru/post/209128/ multithreading 2
http://habrahabr.ru/post/200924/ Reader-Writer C++
https://github.com/Amanieu/asyncplusplus
http://habrahabr.ru/company/ifree/blog/196548/
http://habrahabr.ru/company/ifree/blog/195770/ lock-free
http://habrahabr.ru/company/ifree/blog/195948/
http://habrahabr.ru/post/174369/
http://natsys-lab.blogspot.ru/2013/05/lock-free-multi-producer-multi-consumer.html
http://blog.libtorrent.org/2012/12/principles-of-high-performance-programs/
http://software.intel.com/en-us/academic#pid-13124-91
http://www.ops.rsu.ru/en/about.shtml
http://habrahabr.ru/company/xakep/blog/210480/
volatile only guarantees that the compiler will not cache the value in a register (optimizing compilers are very fond of doing this: the more registers, the more they cache). That is, a read of a volatile variable always means a read from memory, and a write to a volatile variable always means a write to memory.
A shared object is called a lock-free (non-blocking) object if it guarantees that some thread will complete an operation on the object in a finite number of steps regardless of how the other threads behave (even if those other threads have crashed).
The stricter notion of a wait-free object says: an object is wait-free if every thread completes an operation on the object in a finite number of steps.
The lock-free condition guarantees progress for at least one thread, whereas the stronger wait-free condition guarantees successful completion for all threads.
http://michaelrbernste.in/2013/09/23/formalizing-concurrency-distribution-and-mobility.html
http://developerblog.redhat.com/2013/08/15/c-cpp11-parallelism/
http://www.youtube.com/watch?feature=player_embedded&v=EJOl9VDbslI
http://herbsutter.com/2013/02/11/atomic-weapons-the-c-memory-model-and-modern-hardware/
http://www.ibm.com/developerworks/aix/library/au-aix-multicore-multiprocessor/index.html
http://pavpanchekha.com/blog/concurrency.html
http://pavpanchekha.com/blog/time-clocks.html
http://www.teigfam.net/oyvind/home/technology/072-pike-sutter-concurrency-vs-concurrency/
http://www.youtube.com/watch?v=f6kdp27TYZs
http://si14.livejournal.com/42022.html
https://news.ycombinator.com/item?id=5711232
https://news.ycombinator.com/item?id=6038341
http://queue.acm.org/detail.cfm?id=2513575
http://habrahabr.ru/post/184436/ C++
http://habrahabr.ru/post/185706/ C++
http://habrahabr.ru/post/186200/ C++
http://www.infoq.com/news/2013/07/core-async Clojure and ClojureScript
http://queue.acm.org/detail.cfm?id=2492433 non-blocking algorithms
http://rfw.name/blog/2013/07/13/asynchrony.html ASYNC
Java
http://lmax-exchange.github.io/disruptor/
C#
http://higherlogics.blogspot.ca/2013/09/clr-concurrency-preventing-torn-reads.html
http://msdn.microsoft.com/en-us/library/windows/desktop/ms684251(v=vs.85).aspx
http://www.albahari.com/threading/
http://www.codeproject.com/Articles/152765/Task-Parallel-Library-1-of-n C#
http://www.codeproject.com/Articles/362996/Multi-core-programming-using-Task-Parallel-Library
http://stackoverflow.com/questions/6584397/how-to-start-a-thread-at-specific-coresolved
http://www.chrisstucchio.com/blog/2013/why_not_python.html
https://medium.com/code-adventures/174f1fe66127
https://justin.harmonize.fm/Development/2008/09/09/threading-model-overview.html
http://kachayev.github.com/talks/kharkivpy%230/index.html#/
https://blog.heroku.com/archives/2013/2/24/concurrency_is_not_parallelism/
http://www.infoq.com/interviews/coutts-haskell
http://journal.stuffwithstuff.com/2013/02/24/iteration-inside-and-out-part-2/
http://erratasec.blogspot.co.uk/2013/02/multi-core-scaling-its-not-multi.html
http://ak.channel9.msdn.com/ch9/b54e/8ceeaa3d-551e-4184-a964-9fd4012cb54e/GoingNative2012ThreadsSharedVar_med_ch9.mp4
http://www.reddit.com/r/programming/comments/1bm16t/simon_peyton_jones_explains_how_to_deal_with/
http://www.yosefk.com/blog/checkedthreads-bug-free-shared-memory-parallelism.html
http://blog.paralleluniverse.co/post/44146699200/spaceships
http://concurrencykit.org/index.html
http://preshing.com/20130605/the-worlds-simplest-lock-free-hash-table
https://habrahabr.ru/post/317882/ lock free list
Blocking
http://habrahabr.ru/company/nordavind/blog/176541/
TLS thread local storage
http://en.wikipedia.org/wiki/Thread-local_storage
http://itw66.ru/blog/c_plus_plus/537.html
Green Threads
http://c9x.me/gthreads/intro.html
Wiki
http://en.wikipedia.org/wiki/Readers-writers_problem
http://en.wikipedia.org/wiki/Producers-consumers_problem
http://en.wikipedia.org/wiki/Peterson's_algorithm
http://en.wikipedia.org/wiki/Non-blocking_algorithm
http://en.wikipedia.org/wiki/Synchronization_(computer_science)
http://en.wikipedia.org/wiki/Compare-and-swap
http://en.wikipedia.org/wiki/Non-blocking_synchronization
http://en.wikipedia.org/wiki/Memory_barrier
http://en.wikipedia.org/wiki/Lock-free_and_wait-free_algorithms
http://en.wikipedia.org/wiki/Inter-process_communication
https://github.com/kaisellgren/Concurrency-concepts
http://habrahabr.ru/post/164487/ in Java
Lock-free algorithms
http://www.yosefk.com/blog/its-locking-if-its-blocking.html
http://habrahabr.ru/post/182722/
http://habrahabr.ru/company/ifree/blog/219201/ Queue
http://msdn.microsoft.com/en-us/library/windows/desktop/ee418650(v=vs.85).aspx
http://preshing.com/20120612/an-introduction-to-lock-free-programming
http://www.1024cores.net/home/lock-free-algorithms/introduction
http://woboq.com/blog/introduction-to-lockfree-programming.html
http://www.infoq.com/presentations/Lock-Free-Algorithms
Lock-free queue:
http://habrahabr.ru/post/174369/
http://moodycamel.com/blog/2013/a-fast-lock-free-queue-for-c++
http://www.drdobbs.com/article/print?articleId=210604448&siteSectionName=parallel
http://www.drdobbs.com/article/print?articleId=211601363&siteSectionName=parallel
http://www.drdobbs.com/article/print?articleId=212201163&dept_url=/cpp/
http://www.reddit.com/r/programming/comments/yhv2v/writing_lockfree_code_a_corrected_queue/
http://dev.hasenj.org/post/31396790746/coroutines-vs-explicitly-async-apis
http://jordanorelli.tumblr.com/post/31533769172/why-i-went-from-python-to-go-and-not-node-js
http://www.codeproject.com/Articles/153898/Yet-another-implementation-of-a-lock-free-circular
http://habrahabr.ru/company/ideco/blog/147086/#habracut
http://habrahabr.ru/post/147107/
http://habrahabr.ru/post/147099/
http://habrahabr.ru/post/150801/
http://habrahabr.ru/post/118850/ Lock-free
http://habrahabr.ru/post/118515/ Lock-free
http://habrahabr.ru/post/118390/
http://habrahabr.ru/post/130101/ Non-blocking queue
http://habrahabr.ru/post/118917/
http://habrahabr.ru/post/118466/
http://habrahabr.ru/post/108857/ Observer in C++
CONTINUATIONS
http://blog.rfw.name/2012/10/11/continuations.html
http://matt.might.net/articles/by-example-continuation-passing-style/
http://blogs.msdn.com/b/wesdyer/archive/2007/12/22/continuation-passing-style.aspx
https://speakerdeck.com/u/multicoreworld/p/james-reinders-intel-united-states
http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html
http://mortoray.com/2011/12/16/how-does-a-mutex-work-what-does-it-cost/
http://www.albahari.com/threading/
http://scalibq.wordpress.com/2012/06/01/multi-core-and-multi-threading/
http://www.ibm.com/developerworks/aix/library/au-multithreaded_structures1/index.html
http://www.infoq.com/presentations/Thinking-Parallel-Programming
http://yow.eventer.com/events/1004/talks/1055
http://blog.incubaid.com/2012/03/28/the-game-of-distributed-systems-programming-which-level-are-you/
http://www.alexonlinux.com/multithreaded-simple-data-type-access-and-atomic-variables
http://www.youtube.com/watch?v=yHi6tPwAy98 Java multiprocessor programming
http://www.reddit.com/r/programming/comments/zv8ib/coroutines_vs_explicitly_async_apis/
http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html BOOK
http://attractivechaos.wordpress.com/2011/10/06/multi-threaded-programming-efficiency-of-locking/
http://www.igvita.com/2010/08/18/multi-core-threads-message-passing/
http://video.ch9.ms/ch9/0579/28a77412-2541-4857-97f1-9f8e012b0579/SPLASH2011DavidUngar_ch9.wmv
http://glyph.twistedmatrix.com/2012/01/concurrency-spectrum-from-callbacks-to.html
Java and Scala Concurrency
http://channel9.msdn.com/Shows/Going+Deep/Hewitt-Meijer-and-Szyperski-The-Actor-Model-everything-you-wanted-to-know-but-were-afraid-to-ask
http://nichol.as/zeromq-an-introduction
http://think-async.com/Asio/AsioAndBoostAsio
Pipes
The constant PIPE_SIZE establishes the number of bytes allocated for a pipe's buffer (in practice the pipe buffer is PAGE_SIZE). The constant PIPE_BUF (limits.h) defines the atomic write limit: writes of at most PIPE_BUF bytes to a pipe are guaranteed to be atomic.
Parallel compression
http://www.iasylum.net/writings/parallel-compression.html
OpenMP
http://habrahabr.ru/blogs/cpp/134547/
http://software.intel.com/en-us/articles/intel-guide-for-developing-multithreaded-applications/
C++11
http://en.cppreference.com/w/cpp/atomic
http://www.baptiste-wicht.com/2012/07/c11-concurrency-tutorial-part-4-atomic-type/
http://channel9.msdn.com/Events/GoingNative/GoingNative-2012/Threads-and-Shared-Variables-in-C-11
http://www.corensic.com/Learn/Resources/ConcurrencyTutorialPartTwo.aspx C++ 11
http://solarianprogrammer.com/2011/12/16/cpp-11-thread-tutorial/
http://www.informit.com/articles/printerfriendly.aspx?p=1750198
http://return1.net/blog/2012/Jun/14/multithreading-with-c11-protecting-data
Coroutines
http://en.wikipedia.org/wiki/Coroutine
http://habrahabr.ru/post/143318/
http://www.embeddedrelated.com/showarticle/455.php
http://www.1024cores.net/home/lock-free-algorithms/so-what-is-a-memory-model-and-how-to-cook-it
http://blog.ashodnakashian.com/2011/07/almost-everything-you-read-about-threading-is-wrong/
http://www.codeproject.com/KB/threads/
http://mejedi.livejournal.com/15674.html
http://slowfrog.blogspot.com/2011/05/taking-advantage-of-multiple-cores-with.html
epoll, select
http://banu.com/2011/06/how-to-use-epoll-complete-example/
http://www.kegel.com/c10k.html
http://www.wangafu.net/~nickm/libevent-book/TOC.html
http://www.nightmare.com/medusa/async_sockets.html
Callbacks: synchronous and asynchronous:
http://blog.ometer.com/2011/07/24/callbacks-synchronous-and-asynchronous/
http://news.ycombinator.com/item?id=2907856
http://swtch.com/~rsc/talks/threads07/
http://www.usenix.org/events/hotos03/tech/vonbehren.html
All threads in a process share the same address space.
In other words, data in a static or global variable is located at the same memory address when referred to by any thread of the same process.
Variables on the stack however are local to threads, because each thread has its own stack, residing in a different memory location.
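A small sketch of this with pthreads (the variable and function names are made up for the example): every thread prints the same address for the global, but a different address for its own stack variable.

#include <pthread.h>
#include <stdio.h>

int shared_global = 0;                 /* one copy, same address in every thread */

void *worker(void *arg)
{
    (void)arg;
    int stack_local = 0;               /* lives on this thread's own stack */
    printf("global at %p, local at %p\n",
           (void *)&shared_global, (void *)&stack_local);
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* both threads report the same address for shared_global,
       but different addresses for stack_local */
    return 0;
}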
Atomic operations
http://jfdube.wordpress.com/2011/11/30/understanding-atomic-operations/
http://en.cppreference.com/w/cpp/atomic
http://habrahabr.ru/post/157163/
CAS: compare and swap
CAS: Atomicity guaranteed for the following 3 steps together
Read the original value from a memory location.
Compute the new value to be set.
Set the new value only if the memory location is still the original value.
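A minimal sketch of these three steps as a lock-free increment, implemented as a CAS retry loop with GCC's legacy __sync builtins (see the GCC link just below):

/* Atomically add 1 to *p using compare-and-swap.
   Retries if another thread changed *p between the read and the CAS. */
void cas_increment(volatile int *p)
{
    int old, new_val;
    do {
        old = *p;                 /* 1. read the original value   */
        new_val = old + 1;        /* 2. compute the new value     */
        /* 3. set it only if *p still holds the original value    */
    } while (!__sync_bool_compare_and_swap(p, old, new_val));
}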
Deadlock occurs when two or more threads are blocked while waiting for each other. For example, the first thread is blocked on the second thread, waiting for a resource that the second thread holds. The second thread does not release this resource until it acquires a resource held by the first thread.
An operation that cannot be interrupted is said to be atomic.
Most machines have a Test-and-Set (TS) machine instruction, which returns the old value of a shared variable and assigns a constant value to it in one atomic action.
http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Atomic-Builtins.html
http://en.wikipedia.org/wiki/Compare-and-swap
Critical section: a series of statements that must be performed atomically, and only executed by one thread at a time
The mutex (mutual exclusion) guarantees that only one thread accesses the shared variable at a time.
http://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex
http://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex/86021#86021
The synchronized keyword in Java implements mutual exclusion.
When a synchronized method is called, the calling thread attempts to acquire an exclusive lock on the object. If any other thread holds that lock, then the calling thread stalls until the lock is released. However, this must be used with care because it can lead to deadlock.
http://book.opensourceproject.org.cn/embedded/cmprealtime/index.html
http://www.cs.duke.edu/courses/cps110/spring99/slides/
http://pages.cs.wisc.edu/~swift/classes/cs537-sp09/lectures.html
http://cseweb.ucsd.edu/classes/fa05/cse120/lectures/
deadlock
livelock
starvation
spinlock - A mutex that uses busy waiting is called a spinlock.
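A minimal spinlock sketch built on the test-and-set idea mentioned above, using GCC's __sync_lock_test_and_set / __sync_lock_release builtins (names of the wrapper functions are made up for the example):

/* 0 = unlocked, 1 = locked */
typedef volatile int spinlock_t;

void spin_lock(spinlock_t *l)
{
    /* atomically write 1 and get the old value;
       if the old value was already 1, someone else holds the lock, so keep spinning */
    while (__sync_lock_test_and_set(l, 1))
        ;   /* busy-wait */
}

void spin_unlock(spinlock_t *l)
{
    __sync_lock_release(l);   /* atomically write 0 with release semantics */
}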
Semaphore
1. When you create the semaphore, you can initialize its value to any integer, but after that the only operations you are allowed to perform are increment (increase by one) and decrement (decrease by one). You cannot read the current value of the semaphore.
2. When a thread decrements the semaphore, if the result is negative, the thread blocks itself and cannot continue until another thread increments the semaphore.
3. When a thread increments the semaphore, if there are other threads waiting, one of the waiting threads gets unblocked.
At any time, the value of the semaphore represents the number of additional threads that may enter. If the value is zero, then the next thread will block until one of the threads inside exits and signals. When all threads have exited, the value of the semaphore is restored to n.
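A sketch with POSIX semaphores (sem_init / sem_wait / sem_post), initialized to n = 3 so that at most three threads are inside the section at once (POSIX semaphores block at zero rather than going negative, but the effect is the same; the worker body is just a placeholder):

#include <semaphore.h>
#include <pthread.h>

sem_t slots;                      /* counting semaphore */

void *worker(void *arg)
{
    (void)arg;
    sem_wait(&slots);             /* decrement; blocks if no slots are left */
    /* ... at most 3 threads execute this section at the same time ... */
    sem_post(&slots);             /* increment; wakes one waiter, if any   */
    return NULL;
}

int main(void)
{
    sem_init(&slots, 0, 3);       /* n = 3 additional threads may enter */
    /* create/join worker threads as needed */
    sem_destroy(&slots);
    return 0;
}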
Barrier synchronization comprises three actions:
- the task posts its arrival at the barrier,
- the task waits for other participating tasks to reach the barrier, and
- the task receives notification to proceed beyond the barrier.
typedef struct {
mutex_t br_lock; /* guarding mutex */
cond_t br_cond; /* condition variable */
int br_count; /* num of tasks at the barrier */
int br_n_threads; /* num of tasks participating in the barrier synchronization */
} barrier_t;
barrier(barrier_t *br){
mutex_lock(&br->br_lock);
br->br_count++;
if (br->br_count < br->br_n_threads)
cond_wait(&br->br_cond, &br->br_lock);
else{
br->br_count = 0;
cond_broadcast(&br->br_cond);
}
mutex_unlock(&br->br_lock);
}
Condition variables
A condition variable is a synchronization mechanism that works in partnership with monitors, or with mutexes used in the style of monitors. There are two basic operations on a condition variable: wait and notify. (Some systems use the name signal instead of notify.)
http://stackoverflow.com/questions/70773/pthread-cond-wait-versus-semaphore
When a task examines a condition variable, the task must have exclusive access to that condition variable. Without exclusive access, another task could alter the condition variable's conditions at the same time, which could cause the first task to get an erroneous indication of the variable's state. Therefore, a mutex is always used in conjunction with a condition variable. The mutex ensures that one task has exclusive access to the condition variable until that task is finished with it.
POSIX condition variables are initialized with pthread_cond_init independent of any particular mutex; the mutex is instead passed as an argument to pthread_cond_wait, along with the condition variable being waited on. This is a somewhat error-prone arrangement, because all concurrent waiters need to pass in the same mutex.
The operations corresponding to Java's Object methods notify() and notifyAll() are called pthread_cond_signal and pthread_cond_broadcast.
The API allows a thread to invoke pthread_cond_signal or pthread_cond_broadcast without holding a corresponding mutex, but using this flexibility without introducing a race bug is difficult.
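A minimal wait/notify sketch following the POSIX conventions above (the ready flag is a made-up condition for the example); the condition is re-checked in a loop because a wakeup does not by itself guarantee the condition holds:

#include <pthread.h>

pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  c = PTHREAD_COND_INITIALIZER;
int ready = 0;                       /* the condition being waited on */

void wait_until_ready(void)
{
    pthread_mutex_lock(&m);
    while (!ready)                   /* re-check: protects against spurious wakeups */
        pthread_cond_wait(&c, &m);   /* releases m while waiting, re-acquires on wakeup */
    /* ... use the protected state ... */
    pthread_mutex_unlock(&m);
}

void set_ready(void)
{
    pthread_mutex_lock(&m);          /* change the condition under the same mutex */
    ready = 1;
    pthread_cond_signal(&c);         /* or pthread_cond_broadcast to wake all waiters */
    pthread_mutex_unlock(&m);
}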
Producer-consumer problem
http://msdn.microsoft.com/en-us/magazine/cc163599.aspx
Producers create items of some kind and add them to a data structure; consumers remove the items and process them.
Example: event-driven programs. An "event" may be the user pressing a key or moving the mouse. Whenever an event occurs, a producer thread creates an event object and adds it to the event buffer. Concurrently, consumer threads take events out of the buffer and process them. In this case, the consumers are called "event handlers."
Readers-writers problem
1. Any number of readers can be in the critical section simultaneously.
2. Writers must have exclusive access to the critical section
In other words, a writer cannot enter the critical section while any other
thread (reader or writer) is there, and while the writer is there, no other thread may enter.
A readers/writers lock is much like a mutex, except that when a thread locks the lock, it specifies whether it is planning to do any writing to the protected data structure or only reading from it. Just as with a mutex, the lock operation may not immediately complete; instead, it waits until the lock can be acquired. Any number of readers can hold the lock at the same time; they will not wait for each other. A reader will wait, however, if a writer holds the lock. A writer will wait if the lock is held by any other thread, whether by another writer or by one or more readers.
Readers/writers locks are particularly valuable in situations where some of the read-only operations are time consuming, as when reading a file stored on disk. This is especially true if many readers are expected. The choice between a mutex and a readers/writers lock is a performance trade-off: because the mutex is simpler, it has lower overhead, but the readers/writers lock may pay for its overhead by allowing more concurrency.
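A sketch using the POSIX readers/writers lock (the pthread_rwlock_* procedures also listed in the Pthreads section below); the protected table is a made-up example:

#include <pthread.h>

pthread_rwlock_t rw = PTHREAD_RWLOCK_INITIALIZER;
int table[100];                          /* example shared data */

int reader(int i)
{
    pthread_rwlock_rdlock(&rw);          /* many readers may hold this at once */
    int v = table[i];
    pthread_rwlock_unlock(&rw);
    return v;
}

void writer(int i, int v)
{
    pthread_rwlock_wrlock(&rw);          /* exclusive: waits for all readers and writers */
    table[i] = v;
    pthread_rwlock_unlock(&rw);
}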
Starvation
Thread starvation is the possibility that one thread might wait indefinitely while others proceed.
For most concurrent applications, starvation is unacceptable, so we enforce the requirement of bounded waiting, which means that the time a thread waits on a semaphore (or anywhere else, for that matter) has to be provably finite.
If a writer arrives while there are readers in the critical section, it might wait in queue forever while readers come and go. As long as a new reader arrives before the last of the current readers departs, there will always be at least one reader in the room. This situation is not a deadlock, because some threads are making progress, but it is not exactly desirable. A program like this might work as long as the load on the system is low, because then there are plenty of opportunities for the writers. But as the load increases the behavior of the system would deteriorate quickly (at least from the point of view of writers).
Pthreads
http://www.linuxforu.com/2011/08/light-weight-processes-dissecting-linux-threads/
https://computing.llnl.gov/tutorials/pthreads/
http://www.ibm.com/developerworks/library/l-posix3/
http://www.ibm.com/developerworks/aix/library/au-multithreaded_structures1/index.html
http://www.ibm.com/developerworks/linux/library/l-memory-leaks/index.html
There are around 100 Pthreads procedures, all prefixed with "pthread_", and they can be categorized into four groups:
1) Thread management - creating, joining threads, etc.
2) Mutexes
3) Condition variables
4) Synchronization between threads using read/write locks and barriers
The POSIX standard, for example, includes readers/writers locks with procedures such
as pthread_rwlock_init, pthread_rwlock_rdlock, pthread_rwlock_wrlock,
and pthread_rwlock_unlock. The POSIX standard leaves it up to each individual
system how to prioritize new readers versus waiting writers.
The POSIX standard also includes a more specialized form of readers/writers locks specifically associated with files.
In the POSIX standard, file locks are available only through the complex fcntl procedure.
However, most UNIX-family operating systems also provide a simpler interface, flock.
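A small sketch of the simpler flock interface (the file name is just an example; error handling omitted):

#include <sys/file.h>
#include <fcntl.h>
#include <unistd.h>

void with_file_lock(void)
{
    int fd = open("/tmp/example.lock", O_CREAT | O_RDWR, 0644);
    flock(fd, LOCK_EX);       /* exclusive lock; LOCK_SH gives the shared (reader) form */
    /* ... read or modify the file ... */
    flock(fd, LOCK_UN);       /* release */
    close(fd);
}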
Python
http://blog.incubaid.com/2012/04/02/tracking-asynchronous-io-using-type-systems/
http://docs.python.org/library/threading.html
http://docs.python.org/library/queue.html
http://www.doughellmann.com/PyMOTW/threading/
http://pypi.python.org/pypi/proconex/0.3
https://github.com/jango/calculon
http://dabeaz.blogspot.com/2009/09/python-thread-synchronization.html
http://javouhey.wordpress.com/2011/01/16/concurrency-use-case-producer-consumer/
http://kurt.seifried.org/2010/05/31/python-performance-part-1/
http://appcrawler.com/wordpress/2011/01/19/python-multiple-producerconsumer-queue/
http://www.oluyede.org/blog/2007/04/09/producerconsumer-in-python/
http://effbot.org/zone/thread-synchronization.htm
http://www.java2s.com/Tutorial/Python/0340__Thread/Catalog0340__Thread.htm
http://bytes.com/topic/python/answers/803276-producer-consumer-threading-problem
Links
http://members.verizon.net/~babkin/tpopp/
http://en.wikipedia.org/wiki/Mutex
http://en.wikipedia.org/wiki/Spinlock
http://en.wikipedia.org/wiki/Monitor_%28synchronization%29
http://en.wikipedia.org/wiki/Producers-consumers_problem
http://www.justsoftwaresolutions.co.uk/threading/
http://www.owlnet.rice.edu/~comp422/lecture-notes/
http://softpixel.com/~cwright/programming/threads/
http://stackoverflow.com/questions/62814/difference-between-binary-semaphore-and-mutex
http://www.dalkescientific.com/writings/diary/archive/2007/10/07/wide_finder.html
http://memojo.com/~sgala/blog/2007/09/29/Python-Erlang-Map-Reduce
http://code.google.com/p/hpcource/wiki/Materials
http://www.multicoreinfo.com
WideFinder
http://wikis.sun.com/display/WideFinder/Results
http://wikis.sun.com/display/WideFinder/Wide+Finder+Home
http://effbot.org/zone/wide-finder.htm
GPU Programming - CUDA
http://gpgpu.org/
http://developer.nvidia.com/object/cuda_3_0_downloads.html
http://developer.nvidia.com/object/cuda_training.html
A kernel—a function callable from the host and executed on the device by many threads in parallel.