16.1 Purposes of an Operating System

Files and Resources

Specification

There is a reasonably comprehensive website, TutorialsPoint, which goes into a lot of detail on the various parts of an OS.  Remember to let the specification guide your learning, as there is otherwise too much information to take in.

Operating System and Resources

Why do we need an operating system? 

Why not have our programs interact directly with the hardware? They would run faster and need less RAM.   Back in the dawn of computing they did just that. There was no operating system. Programs were written for one computer and interacted directly with its hardware. They ran fast, they ran lean, and they could only run on that machine.

This is why an operating system is needed: it allows a program to run on many different types of hardware. An operating system provides a layer between program and hardware so that a program can use a standard interface no matter what hardware is present. The operating system translates the software's calls into the hardware calls for that particular device.

Sure, this slows things down a bit and uses more resources, but the benefits are great: increased flexibility and a far greater range of hardware on which a particular program can run. If it weren't for the operating system, computers would not be as widespread as they are today.

The operating system (e.g. Linux, Windows 10, OS X) is the software that maintains and runs the computer system. A more precise definition is: a set of programs that manage computer hardware resources and provide common services for application software.

Operating systems were covered in Unit 1, but some of the responsibilities and tasks they perform include:

Remember, operating systems run two fundamental types of software: system software and application software.

Some facts relating to a basic understanding are:

A program is written code: a static set of instructions waiting for execution.  A process is the dynamic, executing code.  A single program may give rise to several processes or threads of execution.
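
As a small illustration (a minimal sketch in Python; the worker function and thread count are invented for the example), the single static program below becomes one running process containing several threads:

```python
# One program (this file), one process, several threads of execution.
import os
import threading

def worker(n):
    # Each thread runs inside the same process, sharing its memory.
    print(f"thread {n} running in process {os.getpid()}")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all threads before the process terminates
```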

Modern operating systems have a variety of resources that are in demand by many programs/applications (resource scheduling).  The CPU, naturally, needs to be shared, and processor scheduling (multi-tasking) is covered in the next section.  However, other resources include storage, networking, and so on.  The OS must ensure that each running program has [fair] access to these resources in the same way it gets CPU access.  The OS acts as central command for all programs, fielding requests for printing, saving files, etc.  It would be disastrous, for example, if programs were allowed free rein over RAM.  Without controls, programs (especially badly written ones) could easily overwrite the memory area of other programs and even of the OS.

Hiding hardware complexity

The operating system is designed to hide the complexity of the computer's hardware. To achieve this it offers several features.

In short, the operating system hides the hardware's complexity by providing a user-friendly interface and by controlling the system's resources, so that the user has a better experience.

Arguably, the most important component of every modern consumer-device operating system (dedicated and real-time operating systems, such as those in washing machines, nuclear missiles, etc., operate differently) is the kernel. The kernel is the protected core of the OS that communicates directly with the hardware.  I say protected because the only way for applications to communicate with the kernel is through the OS API.  I have included two examples below, one from the Linux OS (Android variant) and one from Windows. Both show that the kernel is the core element of the OS.  It is the kernel that is loaded into memory first, and it remains in memory.


Diagram showing how various tasks and OS functionality is split into layers.  You do not need to memorise this information.


Users and applications do not see the hardware directly, but view it through the operating system. This hides certain hardware details from users and applications; because of this abstraction, users are insulated from changes in the hardware. Abstraction can also be used to make related devices appear the same from the user's point of view. For example, hard disks, USB flash drives and Blu-ray discs are all very different media, but in many operating systems they appear the same to the user.

It is especially important that programmers (who, remember, are also users of the OS) designing programs to run on a given OS are protected from hardware complexities. Modern computer hardware is incredibly complex. Luckily, the operating system hides this complexity through the use of abstraction and high-level APIs. For example, if an application wants to create a file in a particular location, it asks the OS to create that file. The program doesn't need to be concerned with which filesystem the disk is running (FAT32, NTFS, etc.), which disk it is, or whether it is a network server or a local drive. The detail is taken care of by the OS, and the programmer is given a consistent interface regardless of the specific hardware in use. You could change your hard disk and your programs would still function, because the OS ensures it presents an agreed interface to the programmer. The OS itself communicates with the hardware through specific device drivers.
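
To make this concrete, here is a minimal sketch (Python; the filename is invented): the same call works whether the path lives on NTFS, FAT32 or a network share, because the OS and its drivers handle the details.

```python
# The program asks the OS to create a file; the OS worries about the
# filesystem, the disk hardware and the device drivers involved.
with open("example.txt", "w") as f:
    f.write("The OS decides how these bytes reach the physical disk.\n")
```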

More detail on APIs if you're interested can be found in this video:

Processor Management

The operating system loads written code (programs) from secondary storage into memory.  Once loaded and running, a program becomes known as a process, which is dynamic in nature.  A process is simply a running program.

At the heart of an OS process is the process control block (PCB).  The CIE specification does not require this detail, but no real discussion of processes can take place without it. The CIE textbook goes into detail regarding the PCB, and I have tried hard not to duplicate that knowledge here.

The PCB is a data structure in the operating system kernel containing the information needed to manage a particular process. Wikipedia has a good article on these and I have incorporated extracts here. The PCB is the manifestation of a process in an operating system.

The role of the PCBs is central in process management: they are accessed and/or modified by most OS utilities, including those involved with scheduling, memory and I/O resource access, and performance monitoring. It can be said that the set of PCBs defines the current state of the operating system. For example, pointers to other PCBs inside a PCB allow the creation of the queues of processes in the various scheduling states ("ready", "blocked", "running") that are discussed below.

In a modern OS, the PCB is a very complex data structure with many items of data.  Of particular interest are items categorised as process control information, used by the OS to manage the process itself, such as the process state, program counter, saved register contents, scheduling priority and memory limits. (No part of the specification requires specific knowledge of these; just be aware of the kind of information held.)

PCBs themselves are stored in protected kernel memory.  This is because they hold crucial information about the running state of the computer and cannot be left open to interference.
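
As a rough illustration only (a Python sketch; the field names are simplified and invented, and real kernels hold far more), a PCB might look something like this:

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int                          # unique process identifier
    state: str = "ready"              # ready / running / blocked
    program_counter: int = 0          # address of the next instruction
    registers: dict = field(default_factory=dict)   # saved CPU registers
    priority: int = 0                 # scheduling priority
    memory_limits: tuple = (0, 0)     # base and limit of allocated memory
    open_files: list = field(default_factory=list)  # I/O status information
```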

Scheduling (high- and low-level)

Three-level scheduling describes a method of performing process scheduling more efficiently when some processes are swapped out.

Consider this problem: a system contains 50 running processes, all with equal priority, but memory can only hold 10 processes at once. There will therefore always be 40 processes swapped out to virtual memory on the hard disk. Suppose the time taken to swap a process out, or in, is 50 ms each way (this is paging, described below).

With straightforward round-robin scheduling, almost every context switch (the CPU switching between processes) would require a process to be swapped in, because only the 10 most recently used processes are in memory. Even choosing the next process at random, the probability that it must be swapped in is 80% (40/50); and when a process is swapped in, another usually has to be swapped out to make room. Swapping is costly, so the scheduler would waste much of its time doing unneeded swaps.
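
A quick back-of-envelope calculation (a sketch using the figures assumed above) shows why this is so wasteful:

```python
# Expected swap overhead per context switch, using the numbers above:
# 50 processes, 10 in memory, 50 ms per swap in each direction.
p_on_disk = 40 / 50            # chance the chosen process is swapped out
cost_per_swap = 50 + 50        # swap one out + swap one in (ms)
expected_overhead = p_on_disk * cost_per_swap
print(f"{expected_overhead:.0f} ms lost per context switch on average")
```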

That is where three-level scheduling enters the picture. It uses three different schedulers: a low-level scheduler (or short-term scheduler), which can only select among the processes already in memory (in their ready state); a medium-level scheduler, which manages virtual-memory swapping, moving processes between memory and the backing store; and a high-level scheduler (long-term scheduler), whose only concern is admitting new processes into the system and removing them when they finish. When you investigate process states further down, the high-level scheduler manages the admitted and terminated states.

Think of scheduling like an exclusive nightclub bar. The low-level scheduler decides who gets served among those waiting (those in the ready state). The high-level scheduler decides who is allowed into the club in the first place. The medium-level scheduler manages numbers when there is more demand than capacity: it will temporarily remove some people from the queue (depending on priority, etc.) and let in others who are more important or have not yet been served. It might also temporarily remove a client who doesn't know what they want, or can't find their money.

On most systems there will be many more processes running than there are CPUs or CPU cores. The scheduler determines which job gets to use the CPU and when.  The job of the OS's low-level task scheduler is to ensure that efficient use is made of CPU time.  The rest of this section discusses low-level scheduling.

Scheduling Methods

In 9618 you need to know several scheduling techniques.  Note that there is no one "best" technique: all have their benefits and drawbacks.  There are two main types of scheduling: pre-emptive and non-pre-emptive.  Pre-emptive scheduling allows the OS to interrupt a running process (using time slices and/or process priorities to decide what runs next); in non-pre-emptive scheduling a process is not interrupted until its life cycle is complete.  CAIE does not make explicit reference to pre-emptive or non-pre-emptive, but they are important concepts when looking at scheduling algorithms.


Key Terms

CPU burst time refers to the time, in milliseconds, required by a process for its execution. Burst time counts only the CPU time of a process; I/O time is not included. It is also called the execution time or running time of the process before it terminates or reaches a blocked state. You can therefore take burst time to be how long a process would run before terminating or performing an I/O operation.

Preemptive scheduling is a CPU scheduling technique that works by dividing CPU time into slots given to processes. The time slot given may or may not be enough to complete the whole process. When the CPU burst time of a process is greater than the CPU time given to it, the process is placed back into the ready queue and will execute at its next chance. Preemption can occur when a process switches from the running state to the ready state, or from the waiting state to the ready state. Here, the CPU is allocated to a process for a limited amount of time and then taken away, and the process is placed back in the ready queue if it still has burst time remaining; it stays there until it next gets to execute.  If a high-priority process arrives in the ready queue, it does not have to wait for the current process to complete its burst time. Instead, the current process is interrupted in the middle of execution and placed in the ready queue while the high-priority process uses the CPU. In this way, every process in the ready queue gets some CPU time. This makes preemptive scheduling flexible, but it increases the overhead of switching processes between the running and ready states.

Non-preemptive scheduling: once the CPU is allocated to a process, the process holds it until it terminates or reaches a waiting state. Unlike preemptive scheduling, non-preemptive scheduling never interrupts a running process in the middle of execution. Instead, it waits for the process to complete its CPU burst before allocating the CPU to another process.

Key differences between preemptive and non-preemptive scheduling (taken from GeeksforGeeks):

Scheduling Algorithms

Useful websites:

Round Robin

Round robin is one of the oldest and simplest scheduling algorithms. Its name comes from the round-robin principle, where each person gets an equal share of something in turn. It is widely used for process scheduling in multitasking systems and gives starvation-free execution of processes.  It is a preemptive algorithm, as each process is given a capped time slice before being placed back in the queue (see the sketch below).
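
A minimal sketch of the idea in Python (process names, burst times and the 10 ms quantum are all invented for illustration):

```python
from collections import deque

def round_robin(processes, quantum=10):
    # processes: list of (name, remaining burst time in ms)
    queue = deque(processes)
    while queue:
        name, remaining = queue.popleft()
        print(f"{name} runs for {min(quantum, remaining)} ms")
        if remaining > quantum:
            queue.append((name, remaining - quantum))  # pre-empted, requeued
        else:
            print(f"{name} terminates")

round_robin([("P1", 25), ("P2", 10), ("P3", 5)])
```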

Characteristics of Round-Robin Scheduling

Shortest job first (SJF)

SJF (Shortest Job First) is a scheduling algorithm in which the process with the shortest execution time is selected for execution next. This scheduling method is usually non-preemptive (although preemptive versions exist; see SRT below). It significantly reduces the average waiting time of processes awaiting execution.

Characteristics of SJF Scheduling

The main drawback of SJF scheduling is that it cannot be implemented exactly in practice, because the burst times of processes cannot be known in advance. Burst time can sometimes be estimated from the process size, but this is not an accurate metric.
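
Assuming burst times were somehow known, the selection rule itself is simple to sketch (Python; names and burst times invented):

```python
# Non-preemptive SJF: always run the waiting process with the shortest
# (assumed known) burst time, and run it to completion.
ready = [("P1", 30), ("P2", 5), ("P3", 12)]   # (name, burst time in ms)
clock = 0
while ready:
    ready.sort(key=lambda p: p[1])            # shortest job first
    name, burst = ready.pop(0)
    clock += burst                            # runs to completion
    print(f"{name} finishes at {clock} ms")
```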

First come first served (FCFS)

First Come First Served is the full form of FCFS. It is the simplest CPU scheduling algorithm: the process that requests the CPU first gets the CPU first. This scheduling method can be managed with a FIFO queue.  It is non-preemptive, meaning a process that arrives early but has a long burst time (time before termination or I/O) can starve other processes. It does, however, keep scheduling overhead low, as context switching is minimised.  The only criterion for being run is arriving before another process, irrespective of priority or execution time.

As a process enters the ready queue, its PCB (process control block) is linked to the tail of the queue. When the CPU becomes free, it is assigned to the process at the head of the queue (see the sketch below).
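
A tiny sketch of that queue behaviour (Python; the process names are invented and stand in for full PCBs):

```python
from collections import deque

ready_queue = deque()
ready_queue.append("P1")   # PCBs join at the tail, in arrival order
ready_queue.append("P2")
ready_queue.append("P3")

while ready_queue:
    running = ready_queue.popleft()         # head of queue gets the CPU
    print(f"{running} runs to completion")  # non-preemptive: no interruption
```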

Characteristics of FCFS method:

Shortest remaining time (SRT)

The full form of SRT is Shortest Remaining Time. It is also known as preemptive SJF. In this method, the CPU is allocated to the process that is closest to completion, so a newly arrived process with a shorter remaining time can pre-empt the currently running one. The algorithm sorts the ready queue by each process's anticipated CPU burst time, picking the shortest; doing so optimises the average response time of processes.  The biggest problem with sorting processes this way is that we are trying to optimise the schedule using data we don't even have: we don't know what a process's next CPU burst will be. It might immediately request I/O, or it might continue running for minutes (or until its time slice expires).  The best we can do is predict the next CPU burst by assuming it will be related to that process's past bursts.
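
A common textbook way of making that prediction (not required by the syllabus) is exponential averaging, where the next estimate blends the last observed burst with the previous estimate. A sketch, with invented burst data:

```python
# predicted(n+1) = alpha * actual(n) + (1 - alpha) * predicted(n)
def next_estimate(actual, predicted, alpha=0.5):
    return alpha * actual + (1 - alpha) * predicted

estimate = 10.0                    # initial guess, in ms
for actual in [6, 4, 6, 13]:       # observed bursts (invented data)
    estimate = next_estimate(actual, estimate)
    print(f"after a {actual} ms burst, next estimate = {estimate:.2f} ms")
```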

Characteristics of SRT scheduling method:

Remember that even the task scheduler needs the CPU to run. If there is a single CPU core, the OS must hand the CPU over to the various processes, but how can it take back control once the CPU has been ceded to a task?  What stops a task from hogging the CPU forever? Interrupts! These are covered in more depth later, but the scheduler sets up a hardware timer (the programmable interval timer, PIT) that generates an interrupt every n milliseconds.  When the interrupt fires, it is delivered to the kernel and the running code is suspended.  This is transparent to the process, so program code does not need to be written to share CPU time.  On multi-core systems, the task scheduler can trigger an interrupt on other cores.
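
There is a loose user-space analogy in Python (a sketch only, Unix-only, and nothing like a real PIT: the OS delivers a signal rather than a hardware interrupt, but the pattern of "code is interrupted, a handler runs, control returns" is the same):

```python
import signal
import time

def on_tick(signum, frame):
    print("timer fired: a real scheduler would run here")

signal.signal(signal.SIGALRM, on_tick)
signal.setitimer(signal.ITIMER_REAL, 0.1, 0.1)  # fire every 100 ms

time.sleep(0.35)   # "user code"; interrupted transparently by the timer
signal.setitimer(signal.ITIMER_REAL, 0)         # cancel the timer
```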

Process scheduling also needs to be very efficient.  Much time can be wasted waiting on the rate of input rather than on the speed of the CPU. The scheduler must pick the right process to run, making efficient use of the CPU while minimising process switching (which has a cost, as the state of the current process (registers, etc.) must be saved and the new process loaded).  Processes can be given a priority level, with real-time being the highest priority.  The task scheduler manages the various states of processes.

Process states

[Diagram: the process states (new, ready, running, blocked/waiting, terminated) and the transitions between them.]

Remember that the transition of a process from Ready --> Running (and back to Ready, or from Blocked --> Ready) is handled by the low-level scheduler.  The low-level scheduler will ONLY move a process from Ready to Running, and will only ever run processes that are in the ready state.  Thus, a process that is Blocked/Waiting can never go straight back to Running without first entering the Ready state.

Deadlocks

A deadlock is a situation in which two programs sharing resources effectively prevent each other from accessing them, resulting in both programs ceasing to function.

The earliest computer operating systems ran only one program at a time. All of the resources of the system were available to this one program. Later, operating systems ran multiple programs at once, interleaving them. Programs were required to specify in advance what resources they needed so that they could avoid conflicts with other programs running at the same time. Eventually some operating systems offered dynamic allocation of resources. Programs could request further allocations of resources after they had begun running. This led to the problem of the deadlock. Here is the simplest example:

Program 1 requests resource A and receives it.

Program 2 requests resource B and receives it.

Program 1 requests resource B and is queued up, pending the release of B.

Program 2 requests resource A and is queued up, pending the release of A.

Now neither program can proceed until the other releases a resource, and the operating system cannot know what action to take. At this point the only alternative is to abort (terminate) one of the programs.
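
The four steps above can be sketched directly, with two locks standing in for resources A and B and two threads standing in for the programs (Python; as written it will normally deadlock, which is the point, so the threads are daemons and joined with a timeout):

```python
import threading
import time

resource_a = threading.Lock()
resource_b = threading.Lock()

def program_1():
    with resource_a:                 # requests A and receives it
        time.sleep(0.1)              # give program 2 time to take B
        with resource_b:             # queued up: program 2 holds B
            pass

def program_2():
    with resource_b:                 # requests B and receives it
        time.sleep(0.1)
        with resource_a:             # queued up: program 1 holds A
            pass

t1 = threading.Thread(target=program_1, daemon=True)
t2 = threading.Thread(target=program_2, daemon=True)
t1.start(); t2.start()
t1.join(timeout=1); t2.join(timeout=1)
print("deadlocked" if t1.is_alive() else "completed")
```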

Learning to deal with deadlocks had a major impact on the development of operating systems.

Interrupts

Interrupts were covered last year, but in summary: an interrupt handler (also known as an interrupt service routine, ISR) is a function triggered by the reception of an interrupt signal.  Interrupts can be software-generated or hardware-generated. Each interrupt is given a priority, and higher-priority interrupts always take precedence over lower-priority ones.

Software Generated

Software interrupts can be triggered deliberately, by specific calls which generate a given interrupt when run, or raised when execution cannot proceed due to a condition the software has not handled, such as a divide-by-zero error detected by the ALU; the latter are called exceptions.
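
The familiar language-level version of this is an exception (a Python sketch; note that CPython detects division by zero in software rather than via a hardware trap, but the pattern of "execution cannot proceed, a handler takes over" is the same):

```python
try:
    result = 1 / 0                 # execution cannot proceed past here
except ZeroDivisionError:
    print("exception handled; execution continues normally")
```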

Hardware Generated

These are used by hardware devices to signal that they require the attention of the OS. Each time you press a key or move the mouse, an interrupt is generated and handled appropriately.  Hardware interrupts are very common and are the standard way for devices to request attention.

Each interrupt has its own interrupt handler: the sequence of instructions run when the interrupt is raised. Software interrupts are effectively unlimited in number, but hardware interrupts are limited by the number of physical interrupt request (IRQ) lines into the processor.

What Handles the Interrupts?

The OS kernel is what manages interrupts.

When an interrupt is detected, the kernel will:

The above list forms the basis, with explanation, of CIE's definition.  In reality, the contents of the various registers are saved to the PCB, which is held in protected kernel memory.

As discussed above in more detail, task scheduling is driven by interrupts acting as a timer: a certain amount of time is given to a particular process, then the ISR moves the running process back to the ready state and another process is run.

Links

The Kernel

Memory Management

There are two main methods of memory management: partitioned memory (also called segmenting) and paged memory.  There is a third, program segmentation, but this will not form part of the main discussion, as it is a technique employed by the programmer, not the OS.  Paged memory is far more common (and efficient), so this section mostly discusses paging.

Some facts relating to a basic understanding are:

Allocating Space (Memory Management)

The kernel is responsible for allocating memory space to programs and data, and for recovering that memory when it is no longer needed. When you open a program, the kernel decides where to put it in RAM so that it doesn't interfere with other programs. You may have many programs running and many data files open at the same time; they must all be allocated space so that they can work without corrupting one another, and the kernel is responsible for this and for keeping a record of where everything is. When you close a program, the kernel 'reclaims' its memory by updating these records, and that space can then be allocated to another program. Each program's access to memory is ring-fenced, so that even a misbehaving program cannot interfere with memory locations not allocated to it; this is how memory is protected.  The kernel's memory manager carries out this work, in conjunction with the hardware memory management unit (MMU).  There are three main memory management strategies when dealing with multi-tasking operating systems:

Memory Partitioning and Segmenting

Memory segmentation is the division of a computer's primary memory into contiguous segments or sections of fixed or variable size. In a system using segmentation, a reference to a memory location includes a value identifying a segment and an offset within that segment.  Segmenting can be handled in different ways by different operating systems: for example, some will allocate blocks to different functions and schedule a program to load only when enough segments are available to hold the whole program.  Segmenting and paging (described next) are VERY similar concepts, but they manage block sizes differently: paging uses small, equal-sized blocks, while segments can vary in size.

More can be read here: https://study.com/academy/lesson/what-is-memory-partitioning-definition-concept.html

At the risk of oversimplification:

Paging & Virtual Memory

Paging is a very similar concept to segmenting, but instead programs are divided up into pages, each of the same fixed size.  As programs are split into pages, memory is split into equally sized page frames: this is what paging means.  Rather than being given a range of memory, a process is assigned a number of pages.  Pages do not have to be loaded into contiguous page frames, or even in order.  A page table keeps track of all of a process's pages and which page frame each is loaded into.
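
A toy sketch of address translation through a page table (Python; the page size, table contents and addresses are invented):

```python
PAGE_SIZE = 4096                             # bytes per page and per frame

page_table = {0: 7, 1: 2, 2: "disk", 3: 5}   # page number -> frame (or on disk)

def translate(virtual_address):
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]
    if frame == "disk":
        raise LookupError("page fault: page must be loaded from disk")
    return frame * PAGE_SIZE + offset        # physical address

print(hex(translate(0x1234)))                # page 1, offset 0x234 -> frame 2
```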

The difference between paging and virtual memory is the ability to use secondary storage to hold pages, so that more memory appears to be available than is physically present.  The extra memory, created by using secondary storage, is virtual.  In virtual memory, the page table is used to translate the virtual memory location (of the page) to the physical location in memory or on secondary storage, as described next.

Paging is all the more useful because pages can be swapped out of memory and stored on secondary storage when they are not currently being used.  This is the concept of virtual memory, with a physical file on the hard disk called the page file.  The page file is divided into logical units the same size as memory pages.  You will notice that the page file is nearly always at least the same size as RAM, although modern operating systems take many factors into account when left to manage the page file automatically.

When a process tries to access a page that is mapped in its virtual address space but not actually loaded into main memory, an interrupt called a page fault is triggered.  The memory manager (the medium-level scheduler in the model above) then runs and ensures the necessary page is loaded into a page frame (swapping other pages out if needed).

Imagine you have 1 GB of RAM (unrealistic, but we'll go with it for simplicity).  Your OS already takes up 256 MB, leaving 768 MB of RAM.  You load a spreadsheet, web browser and computer game, which use up all of it.  You then want to load a PDF, for which no physical memory is left.  The memory manager determines which pages are not being used at that instant and moves them from RAM to the page file, updating the page table to show that each moved page is now on secondary storage.  Once enough pages are cleared, your PDF can load.  Now imagine you switch back to your spreadsheet: the page table may show that part of that program is no longer in RAM but in the page file.  The memory manager repeats the process, moving other pages out to secondary storage to free up enough RAM to reload the spreadsheet's pages.

This is the main reason why more RAM can speed up your computer.  If you notice your hard disk constantly reading and writing, it could be virtual memory swapping pages back and forth.

The Purpose of the Page File or Swap Partition in simple terms (from HowToGeek)

The purpose of the page file on Windows or swap partition on Linux is to provide additional working memory to your computer. For example, if your computer has 2 GB of RAM and you open a large number of programs or large number of files, your computer might need to store 3 GB of data in its working memory. The computer stores that additional 1 GB of data in its page file or swap space. The page file or swap acts as an “overflow” area to hold the additional data. Your computer automatically transfers data back to its RAM when it’s being used, and moves data to its page file or swap partition when it’s not being used.

If you used an older desktop computer, you could see this happen after you minimised a desktop program for a while. When you maximised it later, it would take a while to appear, and you’d hear your hard drive grinding away while that disk activity LED flashed — its data was being moved back from your page file or swap partition to its RAM. The RAM is much faster than the page file or swap partition. (This is much less common on modern computers that have sufficient amounts of RAM to keep desktop programs in RAM.)

Most applications expect to get the memory they request. If your RAM were full and you had no page file, opening another program would likely make it crash. Having a page file gives programs additional space to use and prevents this from happening.

Management Algorithms

When a page needs a page frame, often an existing frame must be overwritten. What, then, should be swapped out?  The memory manager can employ different algorithms.  For example, it could replace the least-used page, or it could replace the longest resident, recording in the page table the time each page was loaded into memory (the load time rather than a running total, as the latter would require constant updating).  Both approaches have drawbacks: the longest-resident page may still be used frequently, while judging by least use can be unfair to recently loaded pages, which have simply had less time in RAM to accumulate accesses.  A balance of methods is often required.
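
The two policies mentioned above correspond roughly to FIFO (evict the longest resident) and LRU (evict the least recently used). A sketch comparing them on an invented page-reference string:

```python
from collections import OrderedDict

def fifo_faults(pages, frames=3):
    memory, faults = [], 0
    for p in pages:
        if p not in memory:
            faults += 1
            if len(memory) == frames:
                memory.pop(0)               # evict the longest resident
            memory.append(p)
    return faults

def lru_faults(pages, frames=3):
    memory, faults = OrderedDict(), 0
    for p in pages:
        if p in memory:
            memory.move_to_end(p)           # mark as recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict least recently used
            memory[p] = True
    return faults

refs = [1, 2, 3, 1, 4, 5, 1, 2]             # invented reference string
print(fifo_faults(refs), lru_faults(refs))  # 7 vs 6 page faults here
```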

Useful websites:

Program Segmentation

Segmentation allows you to break a program into functions, each of which is loaded into memory only when required.  Take a modern word processor containing millions of lines of code: the programmer will break it into useful functions such as spell checking, printing, mail merge, and so on.  Only when you need a particular part of the program is it loaded into RAM.  Segments can, of course, vary in size depending on the code that needs to be executed (see the sketch below).
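
A loose modern analogue (a Python sketch; the "feature" here uses the standard sqlite3 module purely as an example of code brought into memory only on first use):

```python
import sys

def use_database_feature():
    import sqlite3                 # this 'segment' loads on first call only
    with sqlite3.connect(":memory:") as db:
        return db.execute("SELECT 1").fetchone()

print("sqlite3" in sys.modules)    # False: feature code not yet in memory
use_database_feature()
print("sqlite3" in sys.modules)    # True: loaded once the feature was used
```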

Technically, this is a memory management technique, but one managed by the programmer, not the operating system.

This video, while discussing Android and iOS, does go into virtual memory, which may add to your understanding.

Links

A very good article on virtual memory and its need (from a Windows perspective) can be found here.