LINUX OS


Processes

What is a process? 

Taken mostly from devconnected and Red Hat

A process is an instance (copy) of a computer program that is currently being executed. A process has an owner and is identified by a process ID (also called a PID).

Programs, on the other hand, are lines of code or machine instructions stored on permanent data storage (a disk or a USB drive). They can simply reside on that storage, or they can be in execution, in which case they are running as processes.

Each individual process runs in its own virtual address space and cannot interact with another process except through secure, kernel-managed mechanisms. The kernel is the core code of the Linux operating system.
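
As a small illustration of the PID and owner mentioned above, the sketch below asks the operating system about the process that is running the script itself. The function name describe_current_process is made up for this example; the calls it uses come from Python's standard os and pwd modules.

```python
import os
import pwd


def describe_current_process():
    """Return the PID, parent PID, and owner of the process running this script."""
    uid = os.getuid()  # numeric ID of the user who owns this process
    try:
        owner = pwd.getpwuid(uid).pw_name  # look up the user name for that ID
    except KeyError:
        owner = str(uid)  # no passwd entry (possible in containers); keep the number
    return {"pid": os.getpid(), "ppid": os.getppid(), "owner": owner}


if __name__ == "__main__":
    print(describe_current_process())
```

Running the script twice should normally show a different PID each time, because each run is a new process created from the same program.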
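
A minimal sketch of this isolation, assuming a Linux or other Unix-like system: the child process created by fork gets its own copy of the variable counter, so changing it there does not affect the parent. The only way the child's value reaches the parent here is through a pipe, which is one of the kernel-managed mechanisms. The function and variable names are illustrative.

```python
import os


def fork_and_communicate():
    """Show that a child's memory is separate, and that a pipe can carry data across."""
    counter = 0
    read_fd, write_fd = os.pipe()  # kernel-managed channel between the two processes
    pid = os.fork()                # from here on there are two processes
    if pid == 0:
        # Child: change its own copy of counter, then report it through the pipe.
        try:
            os.close(read_fd)
            counter += 100
            os.write(write_fd, str(counter).encode())
            os.close(write_fd)
        finally:
            os._exit(0)            # leave immediately without running parent code
    # Parent: its copy of counter is untouched by the child's change.
    os.close(write_fd)
    message = os.read(read_fd, 64).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)             # collect the child's exit status
    return counter, message


if __name__ == "__main__":
    parent_value, child_message = fork_and_communicate()
    print("parent still sees:", parent_value)
    print("child reported:", child_message)
```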


The stack is a segment of memory where data such as local variables and function calls get added and removed.

During its lifetime a process uses many system resources, such as CPU time and physical memory.

Linux must keep track of each process and of the system resources it holds so that it can manage that process and the other processes in the system fairly. It would not be fair to the other processes in the system if one process monopolized most of the system's physical memory or its CPUs.
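
Some of the bookkeeping the kernel does per process can be read back by the process itself. The sketch below uses the standard resource module; the function name and dictionary keys are chosen for this example, and the unit of ru_maxrss (kilobytes) is the Linux convention.

```python
import resource


def resource_snapshot():
    """Return a few of the per-process counters the kernel keeps for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "user_cpu_seconds": usage.ru_utime,    # CPU time spent running our own code
        "system_cpu_seconds": usage.ru_stime,  # CPU time the kernel spent on our behalf
        "max_rss_kb": usage.ru_maxrss,         # peak physical memory use (kilobytes on Linux)
    }


if __name__ == "__main__":
    for name, value in resource_snapshot().items():
        print(name, value)
```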

Process life cycle

During the life of a process, it can go through different states. To help understand the various states of a process, compare a Linux process to a human being. Every human being has different stages of life. 

The life cycle begins with the parents giving birth to an offspring (analogous to a process being forked by its parent process).

After birth, humans start living their lives in their surroundings and start using available resources for their survival (analogous to a process being in a Running state).

At some point in the life cycle, humans need to wait for something that they must have before they can continue to the next step in their lives (analogous to a process being in the Sleep state).

And just as every human life must come to an end, every process must also die at some point in time. 
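
The birth and death described above can be sketched in a few lines, assuming a Unix-like system. The parent forks a child, the child ends with an exit code, and the parent collects that code. The function name run_child is hypothetical.

```python
import os


def run_child(exit_code):
    """Fork a child that ends immediately with exit_code; return the code the parent sees."""
    pid = os.fork()               # birth: the parent creates the child
    if pid == 0:
        os._exit(exit_code)       # death: the child terminates with this code
    _, status = os.waitpid(pid, 0)  # the parent waits for the child and collects its status
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print("child exited with code", run_child(7))
```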

Process states

As a process executes it changes state according to its circumstances. Linux processes have the following states:

Running: The process is either running (it is the current process on a CPU) or ready to run (it is runnable and is waiting to be assigned to one of the CPUs).

Sleeping (waiting): The process is waiting for an event or for a resource, such as a read or write to complete.

There are two types of sleeping processes (interruptible and uninterruptible), but the distinction is beyond what we need to know.

Stopped: The process has been stopped, usually by receiving a signal. A process that is being debugged can be in a stopped state.

Zombie: This is a halted process that still has an entry in the process table, typically because its parent has not yet collected its exit status. It is what it sounds like, a dead process.
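
On Linux the current state of a process is visible as a one-letter code in /proc/<pid>/stat (for example R for running, S for sleeping, T for stopped, Z for zombie). The sketch below, which assumes a Linux system with /proc mounted, creates a child that exits at once and watches it sit in the zombie state until the parent collects it. The function names are made up for this example.

```python
import os
import time


def read_state(pid):
    """Return the one-letter state code from /proc/<pid>/stat (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # Layout is "pid (command) STATE ..."; split after the last ")" because
    # the command name itself may contain spaces or parentheses.
    return data.rsplit(")", 1)[1].split()[0]


def observe_zombie(timeout=2.0):
    """Fork a child that exits immediately and return the state seen before it is reaped."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)                      # the child dies straight away
    deadline = time.monotonic() + timeout
    state = read_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)                 # give the child time to finish exiting
        state = read_state(pid)
    os.waitpid(pid, 0)                   # reap the child; its table entry is now removed
    return state


if __name__ == "__main__":
    print("this process:", read_state(os.getpid()))
    print("exited but unreaped child:", observe_zombie())
```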

Process types

User Processes

Most processes in the system are user processes. A user process is one that is initiated by a regular user account and runs in user space. Unless it is run in a way that gives the process special permissions, an ordinary user process has no special access to the processor or to files on the system that don't belong to the user who launched the process. 

Daemon Processes

A daemon process is an application that is designed to run in the background, typically managing some kind of ongoing service. 

A daemon process might listen for an incoming request for access to a service. For example, the httpd daemon listens for requests to view web pages. Or a daemon might be intended to start activities at specific times.

Although daemon processes are typically managed as services by the root (think admin) user, they often run as a non-root user, under a user account dedicated to the service. By running daemons under different user accounts, a system is better protected in the event of an attack.

For example, if an attacker were to take over the httpd daemon (web server), which runs as the apache user, it would give the attacker no special access to files owned by other users (including root) or to other daemon processes.

Systems often start daemons at boot time and have them run continuously until the system is shut down. Daemons can also be started or stopped on demand.

Kernel Processes 

Kernel processes execute only in kernel space. They are similar to daemon processes. The primary difference is that kernel processes have full access to kernel data structures, which makes them more powerful than daemon processes that run in user space. 

Process scheduling

Uniprocessing vs multi-processing

The most precious resource in the system is the CPU. Linux is a multiprocessing operating system. Its objective is to have a process running on each CPU in the system at all times, to maximize CPU utilization whilst not affecting user response time.

If there are more processes than CPUs (and there usually are), the remaining processes must wait until a CPU becomes free before they can run.

For example, assume a single-CPU system with one process running on the CPU. The process requests a read of some data on disk and must wait until the read completes. The question is: what should the CPU do whilst the process is waiting?

Rather than leave the CPU idle, the scheduler chooses the most appropriate process to run next. Linux uses a number of scheduling strategies to ensure "fairness".
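
To make the idea of sharing one CPU concrete, here is a toy round-robin simulation. It is only a sketch: round-robin is swapped in here as the simplest fair strategy and is not what the real Linux scheduler does. The process names, run times, and time slice are made up.

```python
from collections import deque


def round_robin(burst_times, time_slice):
    """Simulate one CPU shared by several processes.

    burst_times maps a process name to the CPU time it still needs.
    Returns (timeline, finished): who ran for how long, and the finishing order.
    """
    ready = deque(burst_times.items())  # processes waiting for the CPU, in order
    timeline = []
    finished = []
    while ready:
        name, remaining = ready.popleft()   # next process gets the CPU
        ran = min(time_slice, remaining)    # it runs for at most one time slice
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            ready.append((name, remaining))  # not done: back to the end of the queue
        else:
            finished.append(name)
    return timeline, finished


if __name__ == "__main__":
    timeline, finished = round_robin({"A": 5, "B": 2, "C": 4}, time_slice=2)
    print(timeline)   # [('A', 2), ('B', 2), ('C', 2), ('A', 2), ('C', 2), ('A', 1)]
    print(finished)   # ['B', 'C', 'A']
```

No process has to wait for another to finish completely, which is the sense of "fairness" here: every waiting process gets a turn after at most one time slice per competitor.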

Context switch

From Tutorialspoint. A context switch happens when a CPU stops running one process and starts running another. A process is more than just code: it also has its own virtual address space, and creating a process involves creating a new virtual address space.

Context switches are computationally intensive since register and memory state must be saved and restored.
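
The kernel counts context switches per process, and a process can read its own counters. The sketch below assumes Linux and uses the standard resource module. A voluntary switch is one where the process gave up the CPU itself (for example by sleeping); an involuntary switch is one where the scheduler took the CPU away. The function names are illustrative, and the exact numbers will vary from run to run.

```python
import resource
import time


def context_switches():
    """Return (voluntary, involuntary) context switch counts for this process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_nvcsw, usage.ru_nivcsw


def switches_caused_by_sleeping(naps=20):
    """Sleep a number of times and return how many voluntary switches that added."""
    before, _ = context_switches()
    for _ in range(naps):
        time.sleep(0.005)   # each sleep hands the CPU back to the scheduler
    after, _ = context_switches()
    return after - before


if __name__ == "__main__":
    print("voluntary, involuntary so far:", context_switches())
    print("added by 20 short sleeps:", switches_caused_by_sleeping())
```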

Multithreading (taken from educative)

What is multithreading?

Multithreading is a technique that allows two or more parts of a program to execute concurrently, making better use of the CPU. As a loose analogy, think of writing code in one program while listening to music in another; multithreading brings that same kind of overlap inside a single program. A program runs as a process, and each process contains one or more threads.

Why use multithreading over multiple processes?

Creating a thread is much less expensive when compared to creating a new process, because the newly created thread uses the current process address space. The time it takes to switch between threads is much less than the time it takes to switch between processes, partly because switching between threads does not involve switching between address spaces.

Communicating between the threads of one process is simple because the threads share the process's address space. Data produced by one thread is therefore immediately available to all the other threads.
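
A short sketch of that sharing, using Python's standard threading module: every thread appends to the same list, and the main thread sees all the results without any pipe or other transfer mechanism. The lock is there because shared data also means threads can interfere with each other. The function names are made up for this example.

```python
import threading


def collect_from_threads(n_threads=4):
    """Start n_threads threads that each add one result to a shared list."""
    results = []               # one list, visible to every thread in the process
    lock = threading.Lock()    # guards the list while a thread modifies it

    def worker(number):
        with lock:
            results.append(number * number)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()               # wait until every thread has finished
    return sorted(results)     # sorted, because threads may finish in any order


if __name__ == "__main__":
    print(collect_from_threads())   # [0, 1, 4, 9]
```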

Example of multithreading

Think about a single processor that is running your IDE. Say you edit one of your code files and click save. When you click save, it will initiate a workflow which will cause bytes to be written out to the physical disk. However, reading from or writing to disk is an expensive operation (it takes a long time), and the CPU will be idle while bytes are being written out to the disk.

While the writing takes place, the idle CPU could work on something useful, and this is where threads come in. The write thread is switched out and the User Interface thread gets scheduled on the CPU, so that if you click elsewhere on the screen, your IDE is still responsive and does not appear hung or frozen.

Threads can give the illusion of multitasking even though, on a single-core CPU, only one thread is executing at any given point in time.
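
The IDE scenario can be imitated with two threads. In this sketch the slow disk write is faked with time.sleep, and "handling a click" is just a counter, so all names and timings are stand-ins for illustration only.

```python
import threading
import time


def save_file(done_event, seconds):
    """Pretend to write a file: wait, then signal that the save has finished."""
    time.sleep(seconds)      # stands in for a slow disk write
    done_event.set()


def edit_while_saving(save_seconds=0.2):
    """Run the fake save in a second thread while the main thread keeps working."""
    done = threading.Event()
    writer = threading.Thread(target=save_file, args=(done, save_seconds))
    writer.start()
    ui_events_handled = 0
    while not done.is_set():
        ui_events_handled += 1   # stands in for responding to a click
        time.sleep(0.01)
    writer.join()
    return ui_events_handled


if __name__ == "__main__":
    print("UI events handled during the save:", edit_while_saving())
```

Without the second thread, the main thread would be stuck inside the save for the whole duration and could not respond to anything.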

Multi core CPU

With advances in hardware technology, it is now common to have multi-core machines. Applications can take advantage of these and have a dedicated core run each thread.

Most computers in 2018 are multi-core. Not all applications need multiple cores, and not all applications can use them. As a general rule, any modern application will use the multi-core features of a Mac or PC if it would make a positive difference to the user experience, e.g. by running faster, smoother, or with more data.

Figure: multi-core CPU

Figure: multi-core CPU running a multithreaded process
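
A program can ask how many CPUs it has to work with. The sketch below uses the standard os module; os.sched_getaffinity is available on Linux, and the fallback covers systems that lack it. The function name usable_cpus is made up.

```python
import os


def usable_cpus():
    """Return (logical CPUs reported for the machine, logical CPUs this process may run on)."""
    total = os.cpu_count() or 1   # cpu_count() can return None if it cannot tell
    try:
        allowed = len(os.sched_getaffinity(0))  # 0 means "the calling process"; Linux only
    except AttributeError:
        allowed = total           # no affinity support: assume all CPUs are usable
    return total, allowed


if __name__ == "__main__":
    total, allowed = usable_cpus()
    print("logical CPUs:", total)
    print("usable by this process:", allowed)
```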


Hyper threading

From pediaa.com: Hyper threading is a technology developed by Intel to increase the performance of the CPU/processor. It allows a single CPU core to run two threads. Multithreading, on the other hand, is a mechanism that allows multiple lightweight threads to run within a process at the same time. Each thread has its own program counter, stack, registers, etc.

Hyper threading makes the operating system recognise each physical core as two virtual or logical cores. In other words, it virtually increases the number of cores in a CPU, so a single physical core runs two threads. It is important to note that hyper threading does not really increase the number of cores; it only increases them virtually or logically. Each virtual core can work independently.
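
The difference between logical and physical cores can be seen in /proc/cpuinfo on x86 Linux systems, where each logical CPU is listed with the physical package ("physical id") and core ("core id") it belongs to. The sketch below counts both from such text. The sample is a made-up, heavily shortened excerpt for a machine with 2 physical cores and hyper threading enabled; other architectures may not provide these fields.

```python
SAMPLE_CPUINFO = """\
processor   : 0
physical id : 0
core id     : 0

processor   : 1
physical id : 0
core id     : 1

processor   : 2
physical id : 0
core id     : 0

processor   : 3
physical id : 0
core id     : 1
"""


def count_cores(cpuinfo_text):
    """Return (logical cores, physical cores) found in /proc/cpuinfo style text."""
    logical = 0
    physical = set()      # distinct (package, core) pairs
    package = None
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "processor":
            logical += 1                     # one entry per logical core
        elif key == "physical id":
            package = value                  # remember which package this entry is in
        elif key == "core id":
            physical.add((package, value))   # the same pair appears once per sibling thread
    return logical, len(physical)


if __name__ == "__main__":
    print(count_cores(SAMPLE_CPUINFO))   # (4, 2)
```

To try it on a real machine, the text can be read with open("/proc/cpuinfo").read() and passed to count_cores.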

videos on multi core and multithreading

Htop


handy commands

htop

post 1st boot

check the OS version: cat /etc/os-release

update the packages:  sudo apt-get update

                      sudo apt-get upgrade

install stress:       sudo apt-get install stress

Virtual memory