LINUX OS
Processes
What is a process?
Taken mostly from devconnected and Redhat
A process is an instance (copy) of a computer program that is currently being executed. A process has an owner and is identified by a process ID (also called a PID).
On the other hand, programs are lines of code or machine instructions stored on permanent data storage (a disk or a USB drive). They can simply sit on your data storage, or they can be in execution; then they are running as processes.
Each individual process runs in its own virtual address space and is not capable of interacting with another process except through secure, kernel-managed mechanisms provided by the code of the Linux operating system.
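As a quick illustration (a minimal sketch in C, not taken from the sources above), a running process can ask the kernel for its own PID, its parent's PID, and the UID of the user who owns it:

#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    printf("PID   : %d\n", (int)getpid());   /* this process's ID */
    printf("PPID  : %d\n", (int)getppid());  /* the parent that created it */
    printf("Owner : %d\n", (int)getuid());   /* UID of the user who owns the process */
    return 0;
}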
The stack is a segment of memory where data such as your local variables and function calls get added and removed.
During the lifetime of a process it will use many system resources:
It will use the CPUs in the system to run its instructions.
It will use the system's physical memory to hold it and its data.
It will open and use files within the filesystems.
It may directly or indirectly use the physical devices in the system.
Linux must keep track of the process itself and of the system resources that it has so that it can manage it and the other processes in the system fairly. It would not be fair to the other processes in the system if one process monopolized most of the system's physical memory or its CPUs.
Processes life cycle
During the life of a process, it can go through different states. To help understand the various states of a process, compare a Linux process to a human being. Every human being has different stages of life.
The life cycle begins with the parents giving birth to an offspring (analogous to a process being forked by its parent process).
After birth, humans start living their lives in their surroundings and start using available resources for their survival (analogous to a process being in the Running state).
At some point in the life cycle, humans need to wait for something that they must have before they can continue to the next step in their lives (analogous to a process being in the Sleep state).
And just as every human life must come to an end, every process must also die at some point in time.
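Here is a minimal C sketch of that life cycle (an illustrative example, not from the sources): the parent forks a child, the child runs, sleeps briefly, and dies, and the parent collects its exit status.

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();               /* "birth" of the child process */
    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        printf("child %d: running\n", (int)getpid());
        sleep(1);                     /* the child "sleeps" while waiting */
        exit(42);                     /* "death" of the child */
    }
    int status;
    waitpid(pid, &status, 0);         /* the parent reaps the dead child */
    printf("parent: child exited with %d\n", WEXITSTATUS(status));
    return 0;
}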
Process states
As a process executes it changes state according to its circumstances. Linux processes have the following states:
(R) Running
The process is either running (it is currently executing on a CPU) or it is ready to run (it is runnable and is waiting to be assigned to one of the CPUs).
(S) Sleeping
The process is waiting for an event or for a resource, such as waiting for a read or write to complete.
There are two types of SLEEPING processes, but that is beyond what we need to know.
(T) Stopped
The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.
(Z) Zombie
This is a process that has terminated but, because its parent has not yet read (reaped) its exit status, still has an entry in the process table. It is what it sounds like: a dead process. The sketch after this list shows how one can arise.
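Here is a minimal sketch (an illustrative C example, not from the sources) that deliberately creates a zombie: the child exits right away, but the parent never calls wait(), so for about 30 seconds ps shows the child in state Z (<defunct>).

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        exit(EXIT_FAILURE);
    }
    if (pid == 0) {
        exit(EXIT_SUCCESS);           /* child: terminate immediately */
    }
    /* Parent: never reaps the child, so while it sleeps the command
     * "ps -o pid,stat,comm" shows the child with state Z. */
    printf("child %d is now a zombie; inspect it with ps\n", (int)pid);
    sleep(30);
    return 0;
}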
Process types
User Processes
Most processes in the system are user processes. A user process is one that is initiated by a regular user account and runs in user space. Unless it is run in a way that gives the process special permissions, an ordinary user process has no special access to the processor or to files on the system that don't belong to the user who launched the process.
Daemon Process
A daemon process is an application that is designed to run in the background, typically managing some kind of ongoing service.
A daemon process might listen for an incoming request for access to a service. For example, the httpd daemon listens for requests to view web pages. Or a daemon might be intended to start activities at specific times.
Although daemon processes are typically managed as services by the root (think admin) user, daemon processes often run as a non-root user, under a user account that is dedicated to the service. By running daemons under different user accounts, a system is better protected in the event of an attack.
For example, if an attacker were to take over the httpd daemon (web server), which runs as the Apache user, it would give the attacker no special access to files owned by other users (including root) or other daemon processes.
Systems often start daemons at boot time and have them run continuously until the system is shut down. Daemons can also be started or stopped on demand.
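For illustration only, here is a rough sketch of the classic double-fork technique a program can use to put itself into the background as a daemon. Real services are usually managed by systemd or another init system instead, and the endless sleep loop here just stands in for whatever ongoing service the daemon would provide.

#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

int main(void)
{
    /* First fork: the parent exits so the shell gets its prompt back. */
    pid_t pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    /* Start a new session so the daemon has no controlling terminal. */
    if (setsid() < 0) exit(EXIT_FAILURE);

    /* Second fork: makes sure the daemon can never reacquire a terminal. */
    pid = fork();
    if (pid < 0) exit(EXIT_FAILURE);
    if (pid > 0) exit(EXIT_SUCCESS);

    /* Run from the root directory and drop the inherited standard streams. */
    umask(0);
    if (chdir("/") != 0) exit(EXIT_FAILURE);
    close(STDIN_FILENO);
    close(STDOUT_FILENO);
    close(STDERR_FILENO);

    /* The ongoing service would live here; this sketch just sleeps forever. */
    for (;;) sleep(60);
}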
Kernel Processes
Kernel processes execute only in kernel space. They are similar to daemon processes. The primary difference is that kernel processes have full access to kernel data structures, which makes them more powerful than daemon processes that run in user space.
Process scheduling
Uniprocessing vs multi-processing
The most precious resource in the system is the CPU. Linux is a multiprocessing operating system: its objective is to have a process running on each CPU in the system at all times, to maximize CPU utilization whilst not affecting user response time.
If there are more processes than CPUs (and there usually are), the rest of the processes must wait until a CPU becomes free before they can be run.
For example, assume a single-CPU system with one process running on the CPU; it then requests a read of some data on disk. The process must wait until the read is completed. The question is: what should the CPU do whilst the process is waiting?
In a uniprocessing system, for example DOS, the CPU would simply sit idle and the waiting time would be wasted CPU time.
In a multiprocessing system many processes are kept in memory at the same time. Whenever a process has to wait, the operating system takes the CPU away from that process and gives it to another, more deserving process. This is called context switching.
It is the scheduler which chooses the most appropriate process to run next, and Linux uses a number of scheduling strategies to ensure "fairness".
Context switch
From Tutorialspoint. A process is more than just code; it is also a virtual address space, and creating a process involves creating a new virtual address space.
Context switches are computationally intensive, since register and memory state must be saved and restored.
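On Linux you can actually see this happening. The sketch below (an illustrative example, assuming glibc) uses getrusage() to print how many voluntary context switches (the process blocked or slept) and involuntary ones (the scheduler preempted it) the current process has accumulated.

#include <stdio.h>
#include <sys/resource.h>
#include <unistd.h>

int main(void)
{
    sleep(1);                                  /* force at least one voluntary switch */
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0) {
        printf("voluntary context switches  : %ld\n", ru.ru_nvcsw);
        printf("involuntary context switches: %ld\n", ru.ru_nivcsw);
    }
    return 0;
}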
What is multithreading?
Multithreading is a technique that allows for concurrent (simultaneous) execution of two or more parts of a program for maximum utilization of a CPU. As a really basic example, multithreading allows you to write code in one program and listen to music in another. Programs are made up of processes and threads. You can think of it like this:
A program is an executable file like chrome.exe.
A process is an executing instance (copy) of a program. When you double click on the Google Chrome icon on your computer, you start a process which will run the Google Chrome program.
A thread is the smallest executable unit of a process. A process can have multiple threads with one main thread. In the example, a single thread could be displaying the current tab you’re in, and a different thread could be rendering another tab.
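Here is a minimal sketch of one process with two threads, using POSIX threads (an illustrative example, compile with -pthread; the background task is just a placeholder): the main thread keeps working while a second thread handles something else.

#include <stdio.h>
#include <pthread.h>
#include <unistd.h>

static void *background_task(void *arg)
{
    (void)arg;
    printf("worker thread: doing a long task\n");
    sleep(1);                       /* stand-in for real work */
    printf("worker thread: done\n");
    return NULL;
}

int main(void)
{
    pthread_t worker;
    pthread_create(&worker, NULL, background_task, NULL);

    printf("main thread: still responsive while the worker runs\n");

    pthread_join(worker, NULL);     /* wait for the worker to finish */
    return 0;
}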
Why use multithreading over multiple processes?
Creating a thread is much less expensive when compared to creating a new process, because the newly created thread uses the current process address space. The time it takes to switch between threads is much less than the time it takes to switch between processes, partly because switching between threads does not involve switching between address spaces.
Communicating between the threads of one process is simple because the threads share everything, the address space in particular. So, data produced by one thread is immediately available to all the other threads.
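To illustrate that sharing, here is a small sketch (again POSIX threads, with a made-up shared_value variable): one thread writes a plain global variable, and the main thread reads the very same memory directly, with a mutex guarding the access. No copying or kernel messaging is needed.

#include <stdio.h>
#include <pthread.h>

static int shared_value;                 /* one copy, visible to every thread */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *producer(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&lock);
    shared_value = 123;                  /* data produced by one thread... */
    pthread_mutex_unlock(&lock);
    return NULL;
}

int main(void)
{
    pthread_t t;
    pthread_create(&t, NULL, producer, NULL);
    pthread_join(t, NULL);

    pthread_mutex_lock(&lock);
    printf("main thread sees %d\n", shared_value);   /* ...is immediately visible here */
    pthread_mutex_unlock(&lock);
    return 0;
}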
Example of multithreading
Think about a single processor that is running your IDE. Say you edit one of your code files and click save. When you click save, it will initiate a workflow which will cause bytes to be written out to the physical disk. However, reading or writing is an expensive operation (takes a long time), and the CPU will be idle while bytes are being written out to the disk.
While the writing takes place, the idle CPU could work on something useful and here is where threads come in - the write thread is switched out and the User Interface thread gets scheduled on the CPU so that if you click elsewhere on the screen, your IDE is still responsive and does not appear hung or frozen.
Threads can give the illusion of multitasking even though at any given point in time the CPU is executing only one thread.
Multi core CPU
With advances in hardware technology, it is now common to have multi-core machines. Applications can take advantage of these and have a dedicated core run each thread.
Most computers in 2018 are multi-core. Not all applications need multi-core and not all applications can access multi-core. As a general rule, any modern application will access the multi-core features of a Mac or PC if in fact it would make a positive difference to the user experience, e.g. by running faster, smoother, or with more data.
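As a small sketch (assuming Linux/glibc), a program can ask how many logical CPUs are currently online with sysconf():

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    long cores = sysconf(_SC_NPROCESSORS_ONLN);   /* logical CPUs the scheduler can use */
    printf("logical CPUs online: %ld\n", cores);
    return 0;
}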
[Diagrams: a multi-core CPU, and a multi-core CPU running a multithreaded process]
Hyper threading
From pediaa.com: Hyper threading is a technology developed by Intel to increase the performance of the CPU/processor. It allows a single physical core to run two threads. On the other hand, multithreading is a mechanism that allows running multiple lightweight threads within a process at the same time. Each thread has its own program counter, stack, registers, etc.
It makes the operating system recognise each physical core as two virtual or logical cores. In other words, it virtually increases the number of cores in a CPU. Therefore, a single processor runs two threads. It is important to note that hyper threading really does not increase the number of cores – it just increases the cores virtually or logically. Each virtual core can work independently.