Reflecting on https://www.linkedin.com/learning/parallel-and-concurrent-programming-with-c-plus-plus-part-1
Sequential programming is when a single processor executes a series of instructions one after another, whereas parallel programming is when multiple processors execute different parts of a task simultaneously. Parallel execution increases overall throughput, but at the cost of extra coordination and communication between processors.
Flynn's Taxonomy: a system that classifies multiprocessor architectures into four classes based on the number of concurrent instruction streams and data streams.
SISD (Single Instruction Single Data): The simplest architecture, a sequential computer with a single processor.
SIMD (Single Instruction Multiple Data): A parallel computer with multiple processors executing the same instruction, each operating on different data. (Used in GPUs to execute the same instruction on a massive data set, e.g., image processing.)
MISD (Multiple Instruction Single Data): Multiple processors, each executing its own separate series of instructions, but all operating on the same stream of data. (Not a commonly used architecture.)
MIMD (Multiple Instruction Multiple Data): Multiple processors, each executing a different series of instructions on a different set of data. (The most commonly used architecture, e.g., multi-core PCs.)
SPMD (Single Program Multiple Data): Multiple processors executing the same program simultaneously, but each can use different data. (The most common style of parallel programming; see the sketch after this list.)
MPMD (Multiple Program Multiple Data): Multiple processors executing different programs at the same time while using different data.
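To make SPMD concrete, here is a minimal C++ sketch using std::thread: every thread runs the same function, but each works on a different slice of the data. The thread count, data size, and slicing scheme are illustrative assumptions, not from the course.

```cpp
#include <cstddef>
#include <iostream>
#include <numeric>
#include <thread>
#include <vector>

// SPMD style: every thread runs the same function on a different slice of the data.
void partial_sum(const std::vector<int>& data, std::size_t begin, std::size_t end,
                 long long& result) {
    result = std::accumulate(data.begin() + begin, data.begin() + end, 0LL);
}

int main() {
    std::vector<int> data(1'000'000, 1);
    const std::size_t num_threads = 4;               // illustrative thread count
    const std::size_t chunk = data.size() / num_threads;

    std::vector<std::thread> threads;
    std::vector<long long> results(num_threads, 0);
    for (std::size_t i = 0; i < num_threads; ++i) {
        std::size_t begin = i * chunk;
        std::size_t end = (i == num_threads - 1) ? data.size() : begin + chunk;
        threads.emplace_back(partial_sum, std::cref(data), begin, end,
                             std::ref(results[i]));
    }
    for (auto& t : threads) t.join();                // wait for all workers

    std::cout << std::accumulate(results.begin(), results.end(), 0LL) << '\n';
}
```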
Shared Memory: all processors have access to the same memory as part of a global address space and they see everything that happens in the shared memory space.
Uniform Memory Access (UMA): all processors can access memory equally fast, in a uniform way. The most common UMA architecture is the Symmetric Multiprocessing (SMP) system, where two or more identical processors are connected to a single shared memory, often through a system bus. SMP systems have a cache coherency problem, which is handled by hardware.
Non-uniform Memory Access (NUMA): Often made by connecting multiple SMP systems together. Every processor can see everything in memory, but access is non-uniform: a processor reaches its own local memory faster than memory attached to another processor.
Distributed Memory: Each processor has its own local memory with its own address space. All processors are connected through some sort of network and each operates independently with no changes reflected in the memory of other processors. The advantage here is that these systems are scalable.
A Process consists of the program's code, data, and state information. Within every process there are one or more sub-elements called Threads, each an independent path of execution through the program; they can only exist as part of a process. Threads that belong to the same process share that process's address space. Threads are generally lightweight compared to processes: they require less overhead to create and terminate, and the OS can switch between threads faster than between processes.
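A minimal sketch of creating and joining threads with C++11's std::thread; the greet function and its arguments are placeholders.

```cpp
#include <iostream>
#include <thread>

// Each std::thread is an independent path of execution within the process.
void greet(int id) {
    std::cout << "hello from thread " << id << '\n';
}

int main() {
    std::thread t1(greet, 1);   // both threads share the process's address space
    std::thread t2(greet, 2);
    t1.join();                  // wait for each thread to finish
    t2.join();
}
```

Since the two threads run concurrently, their output lines may appear in either order.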
Concurrency: refers to the ability of a program to be broken into parts that can run independently of each other without affecting the end result. Concurrent execution is when two independent tasks overlap in time; that alone is not parallel execution, which requires parallel hardware.
Concurrency is about the structure of the program, dealing with multiple things at once.
Parallelism is about simultaneous execution, actually doing multiple things at once.
A concurrent program is not inherently parallel.
The OS controls thread execution through a scheduler that assigns processes and threads to run on available CPUs. When a process is created it gets loaded into memory and placed in the ready queue; the scheduler then cycles through the ready processes so each gets a chance to execute on the processor.
Scheduling algorithms for context switching: first come, first served; shortest job next; priority; shortest remaining time; round-robin; multilevel queues.
A thread begins in the New state once created. When it starts, it enters the Runnable state, meaning the OS can schedule it for execution. A thread goes into the Blocked state when it has to wait for an event to occur; while blocked it doesn't use any CPU resources. A thread can wait for another thread to finish its task by calling the join() method. A thread enters the Terminated state when it completes execution or gets aborted.
A Daemon (Background) Thread does not prevent the process from terminating. Calling detach() on a thread makes it run independently and non-joinable.
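A minimal sketch of a detached, daemon-style thread; the loop body and sleep durations are arbitrary, chosen only to give the background thread time to run before the process exits.

```cpp
#include <chrono>
#include <iostream>
#include <thread>

int main() {
    std::thread background([] {
        while (true) {   // runs until the process terminates
            std::cout << "background tick\n";
            std::this_thread::sleep_for(std::chrono::milliseconds(100));
        }
    });
    background.detach();   // now non-joinable; it won't keep the process alive

    std::this_thread::sleep_for(std::chrono::milliseconds(350));
    // main returns and the process terminates, taking the detached thread with it
}
```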
Data Race: a problem that occurs when two or more concurrent threads access the same memory location and at least one of them is writing to that location.
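A minimal sketch of a data race: two threads increment the same counter without synchronization, so the final count is usually less than 200,000. (This is formally undefined behavior in C++, shown only to illustrate the problem; the counter name and iteration count are illustrative.)

```cpp
#include <iostream>
#include <thread>

int counter = 0;    // shared memory location

void increment() {
    for (int i = 0; i < 100'000; ++i)
        ++counter;  // unsynchronized read-modify-write: a data race
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    std::cout << counter << '\n';   // typically less than 200000
}
```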
Critical Section: a code segment that accesses a shared resource and may not operate correctly if executed by multiple threads concurrently.
Mutex (Lock): a mechanism to implement mutual exclusion. Only one thread can possess the lock at a time, so it can be used to prevent multiple threads from accessing a shared resource simultaneously, forcing them to take turns.
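A sketch of fixing the data race above with std::mutex; std::lock_guard is a RAII wrapper that releases the lock when it goes out of scope. The course may use explicit lock()/unlock() calls instead.

```cpp
#include <iostream>
#include <mutex>
#include <thread>

int counter = 0;
std::mutex counter_mutex;

void increment() {
    for (int i = 0; i < 100'000; ++i) {
        std::lock_guard<std::mutex> lock(counter_mutex);  // enter critical section
        ++counter;                                        // only one thread at a time
    }                                                     // lock released here
}

int main() {
    std::thread t1(increment);
    std::thread t2(increment);
    t1.join();
    t2.join();
    std::cout << counter << '\n';   // always 200000
}
```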
Deadlock: happens when a thread is blocked forever waiting for a lock that will never be released, e.g., a thread trying to lock a non-reentrant mutex it has already locked, or two threads each holding a mutex the other one needs.
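A sketch of avoiding the classic two-mutex deadlock (thread 1 locks a then b, thread 2 locks b then a, and each waits forever for the other's mutex); std::scoped_lock (C++17) acquires multiple mutexes together using a deadlock-avoidance algorithm. The mutex and function names are illustrative.

```cpp
#include <mutex>
#include <thread>

std::mutex a, b;

void transfer() {
    std::scoped_lock lock(a, b);   // locks a and b together, avoiding deadlock (C++17)
    // ... work with both shared resources ...
}                                  // both mutexes released here

int main() {
    std::thread t1(transfer);
    std::thread t2(transfer);
    t1.join();
    t2.join();
}
```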
Reentrant Mutex: a mutex that can be locked multiple times by the same thread or process; it has to be unlocked an equal number of times before another thread can lock it.
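In C++ the standard reentrant mutex is std::recursive_mutex. A minimal sketch where a function that already holds the lock calls another function that locks the same mutex again; the function names are illustrative.

```cpp
#include <iostream>
#include <mutex>

std::recursive_mutex m;
int value = 0;

void add(int n) {
    std::lock_guard<std::recursive_mutex> lock(m);  // same thread may lock again
    value += n;
}

void add_twice(int n) {
    std::lock_guard<std::recursive_mutex> lock(m);  // first lock
    add(n);   // relocks m; with a plain std::mutex this would self-deadlock
    add(n);
}

int main() {
    add_twice(5);
    std::cout << value << '\n';   // 10
}
```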
Try Lock: a non-blocking version of the lock function; if the mutex is available it acquires the lock and returns true, otherwise it immediately returns false instead of blocking until the mutex is free.
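A minimal sketch of std::mutex::try_lock, where a worker does other useful work whenever the lock is busy instead of blocking; the counters and the target of 10 are illustrative.

```cpp
#include <iostream>
#include <mutex>
#include <thread>

std::mutex m;
int shared_count = 0;

void worker() {
    int other_work = 0;
    bool done = false;
    while (!done) {
        if (m.try_lock()) {           // acquires the lock only if it's free
            if (shared_count < 10) ++shared_count;
            else done = true;
            m.unlock();
        } else {
            ++other_work;             // do other useful work instead of blocking
        }
    }
}

int main() {
    std::thread t1(worker);
    std::thread t2(worker);
    t1.join();
    t2.join();
    std::cout << shared_count << '\n';   // 10
}
```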