16.1 Purposes of an Operating System

Files and Resources

Specification

There is a reasonably comprehensive website, TutorialsPoint, which goes into a lot of detail on the various parts of an OS.  Remember to let the specification guide your learning, as there is otherwise too much information to take in.

Operating System and Resources

Why do we need an operating system? 

Why not have our programs interact directly with the hardware? They would run faster and need less RAM.   Back in the dawn of computing they did just that. There was no operating system. Programs were written for one computer and interacted directly with its hardware. They ran fast, they ran lean, and they could only run on that machine.

This is why an operating system is needed: it allows a program to run on many different types of hardware. An operating system provides a layer between program and hardware so that a program can use a standard interface no matter what hardware is present. The operating system translates the software's calls into the hardware calls for that particular device.

Sure, this slows things down a bit and uses more resources, but the benefits are great: increased flexibility and a far greater range of hardware on which a particular program can run. If it weren't for the operating system, computers would not be as widespread as they are today.

The operating system (e.g. Linux, Windows 10, OS X) is the software that maintains and runs the computer system. A more precise definition is: a set of programs that manage computer hardware resources and provide common services for application software.

Operating systems were covered in Unit 1, but some of the responsibilities and tasks they perform include:

Remember, operating systems run two fundamental types of software: system software and application software.

Some facts relating to a basic understanding are:

A program is written code: a static set of instructions waiting for execution.  A process is the dynamic, executing code.  A single program may give rise to several processes or threads of execution.
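
As a small illustration (a minimal sketch in Python; the worker function and thread count are invented for the example), the single static program below becomes one running process containing several threads:

```python
# One program (this file), one process, several threads of execution.
import os
import threading

def worker(n):
    # Each thread runs inside the same process, sharing its memory.
    print(f"thread {n} running in process {os.getpid()}")

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all threads before the process terminates
```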

Modern operating systems have a variety of resources that are in demand by many programs/applications (resource scheduling).  The CPU, naturally, needs to be shared, and processor scheduling (multi-tasking) is covered in the next section.  However, other resources include storage, networking, and so on.  The OS must ensure that each running program has [fair] access to these resources in the same way it gets CPU access.  The OS acts as central command for all programs, fielding requests for printing, saving files, etc.  It would be disastrous, for example, if programs were allowed free rein over RAM.  Without controls, programs (especially badly written ones) could easily overwrite the memory area of other programs and even of the OS.

Hiding hardware complexity

The operating system is designed to hide the complexity of the computer's hardware. To achieve this it offers several features.

In short, the operating system hides the hardware's complexity by providing a user-friendly interface and by controlling the system's resources, so that the user has a better experience.

Arguably, the most important component of every modern consumer-device operating system (dedicated and real-time operating systems, such as those in washing machines, nuclear missiles, etc., operate differently) is the kernel. The kernel is the protected core of the OS that communicates directly with the hardware.  I say protected because the only way for applications to communicate with the kernel is through the OS API.  I have included two examples below, one from the Linux OS (Android variant) and one from Windows. Both show that the kernel is the core element of the OS.  It is the kernel that is loaded into memory first, and it remains in memory.


Diagram showing how various tasks and OS functionality is split into layers.  You do not need to memorise this information.


Users and applications do not see the hardware directly, but view it through the operating system. This hides certain hardware details from users and applications; because of this abstraction, users are insulated from changes in the hardware. Abstraction can also be used to make related devices appear the same from the user's point of view. For example, hard disks, USB flash drives and Blu-ray discs are all very different media, but in many operating systems they appear the same to the user.

It is especially important that programmers (who, remember, are also users of the OS) designing programs to run on a given OS are protected from hardware complexities. Modern computer hardware is incredibly complex. Luckily, the operating system hides this complexity through the use of abstraction and high-level APIs. For example, if an application wants to create a file in a particular location, it asks the OS to create that file. The program doesn't need to be concerned with which filesystem the disk is running (FAT32, NTFS, etc.), which disk it is, or whether it is a network server or a local drive. The detail is taken care of by the OS, and the programmer is given a consistent interface regardless of the specific hardware in use. You could change your hard disk and your programs would still function, because the OS ensures it presents an agreed interface to the programmer. The OS itself communicates with the hardware through specific device drivers.
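
To make this concrete, here is a minimal sketch (Python; the filename is invented): the same call works whether the path lives on NTFS, FAT32 or a network share, because the OS and its drivers handle the details.

```python
# The program asks the OS to create a file; the OS worries about the
# filesystem, the disk hardware and the device drivers involved.
with open("example.txt", "w") as f:
    f.write("The OS decides how these bytes reach the physical disk.\n")
```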

More detail on APIs if you're interested can be found in this video:

Processor Management

The operating system loads written code (programs) from secondary storage into memory.  Once loaded and running, a program becomes known as a process, which is dynamic in nature.  A process is simply a running program.

At the heart of an OS process is the process control block (PCB).  The CIE specification does not require this detail, but no real discussion of processes can take place without it. The CIE textbook goes into detail regarding the PCB, and I have tried hard not to duplicate that knowledge here.

The PCB is a data structure in the operating system kernel containing the information needed to manage a particular process. Wikipedia has a good article on these and I have incorporated extracts here. The PCB is the manifestation of a process in an operating system.

The role of the PCBs is central in process management: they are accessed and/or modified by most OS utilities, including those involved with scheduling, memory and I/O resource access, and performance monitoring. It can be said that the set of PCBs defines the current state of the operating system. For example, pointers to other PCBs inside a PCB allow the creation of the queues of processes in the various scheduling states ("ready", "blocked", "running") that are discussed below.

In a modern OS, the PCB is a very complex data structure with many items of data.  Of particular interest are items categorised as process control information, used by the OS to manage the process itself, such as the process state, program counter, saved register contents, scheduling priority and memory limits. (No part of the specification requires specific knowledge of these; just be aware of the kind of information held.)

PCBs themselves are stored in protected kernel memory.  This is because they hold crucial information about the running state of the computer and cannot be left open to interference.
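
As a rough illustration only (a Python sketch; the field names are simplified and invented, and real kernels hold far more), a PCB might look something like this:

```python
from dataclasses import dataclass, field

@dataclass
class PCB:
    pid: int                          # unique process identifier
    state: str = "ready"              # ready / running / blocked
    program_counter: int = 0          # address of the next instruction
    registers: dict = field(default_factory=dict)   # saved CPU registers
    priority: int = 0                 # scheduling priority
    memory_limits: tuple = (0, 0)     # base and limit of allocated memory
    open_files: list = field(default_factory=list)  # I/O status information
```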

Scheduling (high- and low-level)

Three-level scheduling describes a method of performing process scheduling more efficiently when some processes are swapped out.

Consider this problem: a system contains 50 running processes, all with equal priority, but memory can only hold 10 processes at once. There will therefore always be 40 processes swapped out to virtual memory on the hard disk. Suppose the time taken to swap a process out, or in, is 50 ms each way (this is paging, described below).

With straightforward round-robin scheduling, almost every context switch (the CPU switching between processes) would require a process to be swapped in, because only the 10 most recently used processes are in memory. Even choosing the next process at random, the probability that it must be swapped in is 80% (40/50); and when a process is swapped in, another usually has to be swapped out to make room. Swapping is costly, so the scheduler would waste much of its time doing unneeded swaps.
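
A quick back-of-envelope calculation (a sketch using the figures assumed above) shows why this is so wasteful:

```python
# Expected swap overhead per context switch, using the numbers above:
# 50 processes, 10 in memory, 50 ms per swap in each direction.
p_on_disk = 40 / 50            # chance the chosen process is swapped out
cost_per_swap = 50 + 50        # swap one out + swap one in (ms)
expected_overhead = p_on_disk * cost_per_swap
print(f"{expected_overhead:.0f} ms lost per context switch on average")
```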

That is where three-level scheduling enters the picture. It uses three different schedulers: a low-level scheduler (or short-term scheduler), which can only select among the processes already in memory (in their ready state); a medium-level scheduler, which manages virtual-memory swapping, moving processes between memory and the backing store; and a high-level scheduler (long-term scheduler), whose only concern is admitting new processes into the system and removing them when they finish. When you investigate process states further down, the high-level scheduler manages the admitted and terminated states.

Think of scheduling like an exclusive nightclub bar. The low-level scheduler decides who gets served among those waiting (those in the ready state). The high-level scheduler decides who is allowed into the club in the first place. The medium-level scheduler manages numbers when there is more demand than capacity: it will temporarily remove some people from the queue (depending on priority, etc.) and let in others who are more important or have not yet been served. It might also temporarily remove a client who doesn't know what they want, or can't find their money.

On most systems there will be many more processes running than there are CPUs or CPU cores. The scheduler determines which job gets to use the CPU and when.  The job of the OS's low-level task scheduler is to ensure that efficient use is made of CPU time.  The rest of this section discusses low-level scheduling.

Scheduling Methods

In 9618 you need to know several scheduling techniques.  Note that there is no one "best" technique: all have their benefits and drawbacks.  There are two main types of scheduling: pre-emptive and non-pre-emptive.  Pre-emptive scheduling allows the OS to interrupt a running process (using time slices and/or process priorities to decide what runs next); in non-pre-emptive scheduling a process is not interrupted until its life cycle is complete.  CAIE does not make explicit reference to pre-emptive or non-pre-emptive, but they are important concepts when looking at scheduling algorithms.


Key Terms

CPU burst time refers to the time, in milliseconds, required by a process for its execution. Burst time counts only the CPU time of a process; I/O time is not included. It is also called the execution time or running time of the process before it terminates or reaches a blocked state. You can therefore take burst time to be how long a process would run before terminating or performing an I/O operation.

Preemptive scheduling is a CPU scheduling technique that works by dividing CPU time into slots given to processes. The time slot given may or may not be enough to complete the whole process. When the CPU burst time of a process is greater than the CPU time given to it, the process is placed back into the ready queue and will execute at its next chance. Preemption can occur when a process switches from the running state to the ready state, or from the waiting state to the ready state. Here, the CPU is allocated to a process for a limited amount of time and then taken away, and the process is placed back in the ready queue if it still has burst time remaining; it stays there until it next gets to execute.  If a high-priority process arrives in the ready queue, it does not have to wait for the current process to complete its burst time. Instead, the current process is interrupted in the middle of execution and placed in the ready queue while the high-priority process uses the CPU. In this way, every process in the ready queue gets some CPU time. This makes preemptive scheduling flexible, but it increases the overhead of switching processes between the running and ready states.

Non-preemptive scheduling: once the CPU is allocated to a process, the process holds it until it terminates or reaches a waiting state. Unlike preemptive scheduling, non-preemptive scheduling never interrupts a running process in the middle of execution. Instead, it waits for the process to complete its CPU burst before allocating the CPU to another process.

Key differences between preemptive and non-preemptive scheduling (taken from GeeksforGeeks):

Scheduling Algorithms

Useful websites:

Round Robin

Round robin is one of the oldest and simplest scheduling algorithms. Its name comes from the round-robin principle, where each person gets an equal share of something in turn. It is widely used for process scheduling in multitasking systems and gives starvation-free execution of processes.  It is a preemptive algorithm, as each process is given a capped time slice before being placed back in the queue (see the sketch below).
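
A minimal sketch of the idea in Python (process names, burst times and the 10 ms quantum are all invented for illustration):

```python
from collections import deque

def round_robin(processes, quantum=10):
    # processes: list of (name, remaining burst time in ms)
    queue = deque(processes)
    while queue:
        name, remaining = queue.popleft()
        print(f"{name} runs for {min(quantum, remaining)} ms")
        if remaining > quantum:
            queue.append((name, remaining - quantum))  # pre-empted, requeued
        else:
            print(f"{name} terminates")

round_robin([("P1", 25), ("P2", 10), ("P3", 5)])
```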

Characteristics of Round-Robin Scheduling

Shortest job first (SJF)

SJF (Shortest Job First) is a scheduling algorithm in which the process with the shortest execution time is selected for execution next. This scheduling method is usually non-preemptive (although preemptive versions exist; see SRT below). It significantly reduces the average waiting time of processes awaiting execution.

Characteristics of SJF Scheduling

The main drawback of SJF scheduling is that it cannot be implemented exactly in practice, because the burst times of processes cannot be known in advance. Burst time can sometimes be estimated from the process size, but this is not an accurate metric.
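
Assuming burst times were somehow known, the selection rule itself is simple to sketch (Python; names and burst times invented):

```python
# Non-preemptive SJF: always run the waiting process with the shortest
# (assumed known) burst time, and run it to completion.
ready = [("P1", 30), ("P2", 5), ("P3", 12)]   # (name, burst time in ms)
clock = 0
while ready:
    ready.sort(key=lambda p: p[1])            # shortest job first
    name, burst = ready.pop(0)
    clock += burst                            # runs to completion
    print(f"{name} finishes at {clock} ms")
```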

First come first served (FCFS)

First Come First Served is the full form of FCFS. It is the simplest CPU scheduling algorithm: the process that requests the CPU first gets the CPU first. This scheduling method can be managed with a FIFO queue.  It is non-preemptive, meaning a process that arrives early but has a long burst time (time before termination or I/O) can starve other processes. It does, however, keep scheduling overhead low, as context switching is minimised.  The only criterion for being run is arriving before another process, irrespective of priority or execution time.

As a process enters the ready queue, its PCB (process control block) is linked to the tail of the queue. When the CPU becomes free, it is assigned to the process at the head of the queue (see the sketch below).
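
A tiny sketch of that queue behaviour (Python; the process names are invented and stand in for full PCBs):

```python
from collections import deque

ready_queue = deque()
ready_queue.append("P1")   # PCBs join at the tail, in arrival order
ready_queue.append("P2")
ready_queue.append("P3")

while ready_queue:
    running = ready_queue.popleft()         # head of queue gets the CPU
    print(f"{running} runs to completion")  # non-preemptive: no interruption
```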

Characteristics of FCFS method:

Shortest remaining time (SRT)

The full form of SRT is Shortest Remaining Time. It is also known as preemptive SJF. In this method, the CPU is allocated to the process that is closest to completion, so a newly arrived process with a shorter remaining time can pre-empt the currently running one. The algorithm sorts the ready queue by each process's anticipated CPU burst time, picking the shortest; doing so optimises the average response time of processes.  The biggest problem with sorting processes this way is that we are trying to optimise the schedule using data we don't even have: we don't know what a process's next CPU burst will be. It might immediately request I/O, or it might continue running for minutes (or until its time slice expires).  The best we can do is predict the next CPU burst by assuming it will be related to that process's past bursts.
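
A common textbook way of making that prediction (not required by the syllabus) is exponential averaging, where the next estimate blends the last observed burst with the previous estimate. A sketch, with invented burst data:

```python
# predicted(n+1) = alpha * actual(n) + (1 - alpha) * predicted(n)
def next_estimate(actual, predicted, alpha=0.5):
    return alpha * actual + (1 - alpha) * predicted

estimate = 10.0                    # initial guess, in ms
for actual in [6, 4, 6, 13]:       # observed bursts (invented data)
    estimate = next_estimate(actual, estimate)
    print(f"after a {actual} ms burst, next estimate = {estimate:.2f} ms")
```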

Characteristics of SRT scheduling method:

Remember that even the task scheduler needs the CPU to run. If there is a single CPU core, the OS must hand the CPU over to the various processes, but how can it take back control once the CPU has been ceded to a task?  What stops a task from hogging the CPU forever? Interrupts! These are covered in more depth later, but the scheduler sets up a hardware timer (the programmable interval timer, PIT) that generates an interrupt every n milliseconds.  When the interrupt fires, it is delivered to the kernel and the running code is suspended.  This is transparent to the process, so program code does not need to be written to share CPU time.  On multi-core systems, the task scheduler can trigger an interrupt on other cores.
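
There is a loose user-space analogy in Python (a sketch only, Unix-only, and nothing like a real PIT: the OS delivers a signal rather than a hardware interrupt, but the pattern of "code is interrupted, a handler runs, control returns" is the same):

```python
import signal
import time

def on_tick(signum, frame):
    print("timer fired: a real scheduler would run here")

signal.signal(signal.SIGALRM, on_tick)
signal.setitimer(signal.ITIMER_REAL, 0.1, 0.1)  # fire every 100 ms

time.sleep(0.35)   # "user code"; interrupted transparently by the timer
signal.setitimer(signal.ITIMER_REAL, 0)         # cancel the timer
```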

Process scheduling also needs to be very efficient.  Much time can be wasted waiting on the rate of input rather than on the speed of the CPU. The scheduler must pick the right process to run, making efficient use of the CPU while minimising process switching (which has a cost, as the state of the current process (registers, etc.) must be saved and the new process loaded).  Processes can be given a priority level, with real-time being the highest priority.  The task scheduler manages the various states of processes.

Process states

[Diagram: the process states (new, ready, running, blocked/waiting, terminated) and the transitions between them.]

Remember that the transition of a process from Ready --> Running (and back to Ready, or from Blocked --> Ready) is handled by the low-level scheduler.  The low-level scheduler will ONLY move a process from Ready to Running, and will only ever run processes that are in the ready state.  Thus, a process that is Blocked/Waiting can never go straight back to Running without first entering the Ready state.

Deadlocks

A deadlock is a situation in which two programs sharing resources effectively prevent each other from accessing them, resulting in both programs ceasing to function.

The earliest computer operating systems ran only one program at a time. All of the resources of the system were available to this one program. Later, operating systems ran multiple programs at once, interleaving them. Programs were required to specify in advance what resources they needed so that they could avoid conflicts with other programs running at the same time. Eventually some operating systems offered dynamic allocation of resources. Programs could request further allocations of resources after they had begun running. This led to the problem of the deadlock. Here is the simplest example:

Program 1 requests resource A and receives it.

Program 2 requests resource B and receives it.

Program 1 requests resource B and is queued up, pending the release of B.

Program 2 requests resource A and is queued up, pending the release of A.

Now neither program can proceed until the other releases a resource, and the operating system cannot know what action to take. At this point the only alternative is to abort (terminate) one of the programs.
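
The four steps above can be sketched directly, with two locks standing in for resources A and B and two threads standing in for the programs (Python; as written it will normally deadlock, which is the point, so the threads are daemons and joined with a timeout):

```python
import threading
import time

resource_a = threading.Lock()
resource_b = threading.Lock()

def program_1():
    with resource_a:                 # requests A and receives it
        time.sleep(0.1)              # give program 2 time to take B
        with resource_b:             # queued up: program 2 holds B
            pass

def program_2():
    with resource_b:                 # requests B and receives it
        time.sleep(0.1)
        with resource_a:             # queued up: program 1 holds A
            pass

t1 = threading.Thread(target=program_1, daemon=True)
t2 = threading.Thread(target=program_2, daemon=True)
t1.start(); t2.start()
t1.join(timeout=1); t2.join(timeout=1)
print("deadlocked" if t1.is_alive() else "completed")
```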

Learning to deal with deadlocks had a major impact on the development of operating systems.

Interrupts

Interrupts were covered last year, but in summary: an interrupt handler (also known as an interrupt service routine, ISR) is a function triggered by the reception of an interrupt signal.  Interrupts can be software-generated or hardware-generated. Each interrupt is given a priority, and higher-priority interrupts always take precedence over lower-priority ones.

Software Generated

Software interrupts can be triggered deliberately, by specific calls which generate a given interrupt when run, or raised when execution cannot proceed due to a condition the software has not handled, such as a divide-by-zero error detected by the ALU; the latter are called exceptions.
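
The familiar language-level version of this is an exception (a Python sketch; note that CPython detects division by zero in software rather than via a hardware trap, but the pattern of "execution cannot proceed, a handler takes over" is the same):

```python
try:
    result = 1 / 0                 # execution cannot proceed past here
except ZeroDivisionError:
    print("exception handled; execution continues normally")
```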

Hardware Generated

These are used by hardware devices to signal that they require the attention of the OS. Each time you press a key or move the mouse, an interrupt is generated and handled appropriately.  Hardware interrupts are very common and are the standard way for devices to request attention.

Each interrupt has its own interrupt handler: the sequence of instructions run when the interrupt is raised. Software interrupts are effectively unlimited in number, but hardware interrupts are limited by the number of physical interrupt request (IRQ) lines into the processor.

What Handles the Interrupts?

The OS kernel is what manages interrupts.

When an interrupt is detected, the kernel will:

The above list forms the basis, with explanation, of CIE's definition.  In reality, the contents of the various registers are saved to the PCB, which is held in protected kernel memory.

As discussed above in more detail, task scheduling is driven by interrupts acting as a timer: a certain amount of time is given to a particular process, then the ISR moves the running process back to the ready state and another process is run.

Links

The Kernel

Memory Management

There are two main methods of memory management: partitioned memory (also called segmenting) and paged memory.  There is a third, program segmentation, but this will not form part of the main discussion, as it is a technique employed by the programmer, not the OS.  Paged memory is far more common (and efficient), so this section mostly discusses paging.

Some facts relating to a basic understanding are:

Allocating Space (Memory Management)

The kernel is responsible for allocating memory space to programs and data, and for recovering that memory when it is no longer needed. When you open a program, the kernel decides where to put it in RAM so that it doesn't interfere with other programs. You may have many programs running and many data files open at the same time; they must all be allocated space so that they can work without corrupting one another, and the kernel is responsible for this and for keeping a record of where everything is. When you close a program, the kernel 'reclaims' its memory by updating these records, and that space can then be allocated to another program. Each program's access to memory is ring-fenced, so that even a misbehaving program cannot interfere with memory locations not allocated to it; this is how memory is protected.  The kernel's memory manager carries out this work, in conjunction with the hardware memory management unit (MMU).  There are three main memory management strategies when dealing with multi-tasking operating systems:

Memory Partitioning and Segmenting

Memory segmentation is the division of a computer's primary memory into contiguous segments or sections of fixed or variable size. In a system using segmentation, a reference to a memory location includes a value identifying a segment and an offset within that segment.  Segmenting can be handled in different ways by different operating systems: for example, some will allocate blocks to different functions and schedule a program to load only when enough segments are available to hold the whole program.  Segmenting and paging (described next) are VERY similar concepts, but they manage block sizes differently: paging uses small, equal-sized blocks, while segments can vary in size.

More can be read here: https://study.com/academy/lesson/what-is-memory-partitioning-definition-concept.html

At the risk of oversimplification:

Paging & Virtual Memory

Paging is a very similar concept to segmenting, but instead programs are divided up into pages, each of the same fixed size.  As programs are split into pages, memory is split into equally sized page frames: this is what paging means.  Rather than being given a range of memory, a process is assigned a number of pages.  Pages do not have to be loaded into contiguous page frames, or even in order.  A page table keeps track of all of a process's pages and which page frame each is loaded into.
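
A toy sketch of address translation through a page table (Python; the page size, table contents and addresses are invented):

```python
PAGE_SIZE = 4096                             # bytes per page and per frame

page_table = {0: 7, 1: 2, 2: "disk", 3: 5}   # page number -> frame (or on disk)

def translate(virtual_address):
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]
    if frame == "disk":
        raise LookupError("page fault: page must be loaded from disk")
    return frame * PAGE_SIZE + offset        # physical address

print(hex(translate(0x1234)))                # page 1, offset 0x234 -> frame 2
```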

The difference between paging and virtual memory is the ability to use secondary storage to hold pages, so that more memory appears to be available than is physically present.  The extra memory, created by using secondary storage, is virtual.  In virtual memory, the page table is used to translate the virtual memory location (of the page) to the physical location in memory or on secondary storage, as described next.

Paging is all the more useful because pages can be swapped out of memory and stored on secondary storage when they are not currently being used.  This is the concept of virtual memory, with a physical file on the hard disk called the page file.  The page file is divided into logical units the same size as memory pages.  You will notice that the page file is nearly always at least the same size as RAM, although modern operating systems take many factors into account when left to manage the page file automatically.

When a process tries to access a page that is mapped in its virtual address space but not actually loaded into main memory, an interrupt called a page fault is triggered.  The memory manager (the medium-level scheduler in the model above) then runs and ensures the necessary page is loaded into a page frame (swapping other pages out if needed).

Imagine you have 1 GB of RAM (unrealistic, but we'll go with it for simplicity).  Your OS already takes up 256 MB, leaving 768 MB of RAM.  You load a spreadsheet, web browser and computer game, which use up all of it.  You then want to load a PDF, for which no physical memory is left.  The memory manager determines which pages are not being used at that instant and moves them from RAM to the page file, updating the page table to show that each moved page is now on secondary storage.  Once enough pages are cleared, your PDF can load.  Now imagine you switch back to your spreadsheet: the page table may show that part of that program is no longer in RAM but in the page file.  The memory manager repeats the process, moving other pages out to secondary storage to free up enough RAM to reload the spreadsheet's pages.

This is the main reason why more RAM can speed up your computer.  If you notice your hard disk constantly reading and writing, it could be virtual memory swapping pages back and forth.

The Purpose of the Page File or Swap Partition in simple terms (from HowToGeek)

The purpose of the page file on Windows or swap partition on Linux is to provide additional working memory to your computer. For example, if your computer has 2 GB of RAM and you open a large number of programs or large number of files, your computer might need to store 3 GB of data in its working memory. The computer stores that additional 1 GB of data in its page file or swap space. The page file or swap acts as an “overflow” area to hold the additional data. Your computer automatically transfers data back to its RAM when it’s being used, and moves data to its page file or swap partition when it’s not being used.

If you used an older desktop computer, you could see this happen after you minimised a desktop program for a while. When you maximised it later, it would take a while to appear, and you’d hear your hard drive grinding away while that disk activity LED flashed — its data was being moved back from your page file or swap partition to its RAM. The RAM is much faster than the page file or swap partition. (This is much less common on modern computers that have sufficient amounts of RAM to keep desktop programs in RAM.)

Most applications expect to get the memory they request. If your RAM were full and you had no page file, opening another program would likely make it crash. Having a page file gives programs additional space to use and prevents this from happening.

Management Algorithms

When a page needs a page frame, often an existing frame must be overwritten. What, then, should be swapped out?  The memory manager can employ different algorithms.  For example, it could replace the least-used page, or it could replace the longest resident, recording in the page table the time each page was loaded into memory (the load time rather than a running total, as the latter would require constant updating).  Both approaches have drawbacks: the longest-resident page may still be used frequently, while judging by least use can be unfair to recently loaded pages, which have simply had less time in RAM to accumulate accesses.  A balance of methods is often required.
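
The two policies mentioned above correspond roughly to FIFO (evict the longest resident) and LRU (evict the least recently used). A sketch comparing them on an invented page-reference string:

```python
from collections import OrderedDict

def fifo_faults(pages, frames=3):
    memory, faults = [], 0
    for p in pages:
        if p not in memory:
            faults += 1
            if len(memory) == frames:
                memory.pop(0)               # evict the longest resident
            memory.append(p)
    return faults

def lru_faults(pages, frames=3):
    memory, faults = OrderedDict(), 0
    for p in pages:
        if p in memory:
            memory.move_to_end(p)           # mark as recently used
        else:
            faults += 1
            if len(memory) == frames:
                memory.popitem(last=False)  # evict least recently used
            memory[p] = True
    return faults

refs = [1, 2, 3, 1, 4, 5, 1, 2]             # invented reference string
print(fifo_faults(refs), lru_faults(refs))  # 7 vs 6 page faults here
```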

Useful websites:

Program Segmentation

Segmentation allows you to break a program into functions, each of which is loaded into memory only when required.  Take a modern word processor containing millions of lines of code: the programmer will break it into useful functions such as spell checking, printing, mail merge, and so on.  Only when you need a particular part of the program is it loaded into RAM.  Segments can, of course, vary in size depending on the code that needs to be executed (see the sketch below).
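
A loose modern analogue (a Python sketch; the "feature" here uses the standard sqlite3 module purely as an example of code brought into memory only on first use):

```python
import sys

def use_database_feature():
    import sqlite3                 # this 'segment' loads on first call only
    with sqlite3.connect(":memory:") as db:
        return db.execute("SELECT 1").fetchone()

print("sqlite3" in sys.modules)    # False: feature code not yet in memory
use_database_feature()
print("sqlite3" in sys.modules)    # True: loaded once the feature was used
```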

Technically, this is a memory management technique, but one managed by the programmer, not the operating system.

This video, while discussing Android and iOS, does go into virtual memory, which may add to your understanding.

Links

A very good article on virtual memory and its need (from a Windows perspective) can be found here.