CLASS 29

Virtual Memory

We noted how the CPU is shared -- it is multiplexed in time. The approach to sharing memory is different. Whereas we share the CPU by rapidly switching it between users (time-sharing), we share memory by dividing it up (space-sharing).

The mechanism used to share and protect memory is called Virtual Memory (VM). The idea behind virtual memory is that addresses generated by the CPU are inspected and translated before they are allowed to affect memory.

In a VM system, addresses generated by the CPU are called Virtual Addresses (VAs). They are translated by a piece of hardware called the Memory Management Unit (MMU) into actual addresses, which are called Physical Addresses (PAs). The hardware organization looks like this:


        +--------------+
        |    CPU       |                     +---------------+
        |              |---> MMU ---> MAR--->|               |         
        |              |                     |  MEMORY       |
        |              |                     |               |
        |              |<------------>MDR<-->|               |
        |              |                     |               |
        |              |                     +---------------+
        +--------------+

The MMU intercepts memory references and inspects and translates them. This allows the system to "redirect" requests for memory; this redirecting gives the system the ability to enforce space-sharing of memory. For example a memory request for location 9372, as would be generated by, say

        ld [9372],%l0

might be redirected by the MMU to access the actual physical memory location 21660 instead.

How translation is done differs from machine to machine; in our discussion we'll examine how translation is performed on the SPARC (naturally!).

Implementation of VM on the SPARC

Address translation is done on a per-process basis. Each process has a table that specifies how virtual addresses are converted to physical addresses. When a process uses a VA (in a load or store instruction) the table for that specific process is used to translate that address. Rather than having a separate translation for each individual address, virtual addresses are broken up into contiguous blocks of 4K bytes called pages. All the addresses in a single page are mapped in the same way, using only a single line in the page table.

Physical memory is divided up into 4K slots called page frames. Page frames are the smallest chunks of memory that can be protected. Each virtual page is mapped to a page frame by the process's page table.

The byte within a page is specified by the last 12 bits of the virtual address. The first 20 bits of the address are the Page ID. The page table specifies a 20 bit page frame for each 20 bit Page ID.

Page Table:
        ------------------------------------------------
        |  Page ID (20 bits)  |  Page Frame (20 bits)  |
        |                     |                        |
                 .                       .
                 .                       .
                 .                       .

The fundamental form of protection afforded by VM is due to the partitioning of physical memory into page frames, which are used by separate users (Unix processes). By assigning page frames to users, and preventing users from accessing page frames that do not belong to them, virtual memory provide the protection necessary to support memory sharing.

Another form of protection provided by VM is to protect each user from themselves. This is done using two extra bits associated with each page table entry:


 --------------------------------------------------------------
 |  Page ID (20 bits)  |  Page Frame (20 bits)  | R/O? | R/W? |
 |                     |                        |      |      |  
         .                       .                .       .
         .                       .                .       .
         .                       .                .       .

When an address is presented to the MMU, the MMU checks wether it is a load (which reads memory) or a store (which writes memory). If the R/W bit is not set, and the operation is a store, then the MMU signals a Segmentation Violation.

Demand Paging

The OS doesn't need to set aside page frames for all the pages used by the program. If a page doesn't exist, when the application generates a reference to that page, the MMU will signal a page fault. The OS handles a page fault by

finding a page frame for the new page
creating an entry in the page table for the new page
restarting the instruction

The OS only needs to create a small number of pages to get the program running. After that, it relies on page faults to determine what additional pages to create. This is the principle of demand paging.

Noncontiguous memory allocation.

Generally the OS gets a program running by creating four pages: one each for text, data, bss, and stack. Note that there is a huge amount of unused virtual addresses between the program/data/bss (in low memory) and the stack (in high memory). However, this doesn't waste any physical memory because those pages aren't valid, and aren't mapped to any page frames.

Swapping.

When the OS needs a new page frame, and all the page frames are in use (that is, all of physical memory has been allocated), it selects a page frame to be written to disk. The OS:

writes the contents of the page frame to disk, then
invalidates the page table entry that used to point to that page frame (sets R/O and R/W bits to zero)

This frees up the page frame for use by another process. To support this, the MMU keeps track of two more bits associated with each page table entry, the "read" and "written" bits:

------------------------------------------------------------------------
|  Page ID (20 bits)  |  Page Frame (20 bits)  | R/O? | R/W? | R? | W? |
|                     |                        |      |      |    |    |
         .                       .                .       .     .    .
         .                       .                .       .     .    .
         .                       .                .       .     .    .

Whenever a page is created, or read in from disk, its R? and W? columns are set to 0. Every time a memory location is read, the R? column in its page table entry is set to 1. Every time a memory location is written, the W? column in its page table entry is set to 1. The advantage of this statistic is that if a page has not been written to (is not dirty) it doesn't have to be copied to disk when its page frame is reclaimed.

So, VM provides 2 basic services:

It provides memory protection, so multiple users can use memory simultaneously, and so individual users can create segments with R/W or R/O status.
It frees the program from physical memory constraints.

There is a cost to VM, however. It slows programs down to be translating addresses, and page faults (bringing data in from disk) are very slow.

When do we need Virtual Memory?

VM is required when the computer is multi-user, or when programs are too large to fit into memory (example: Sun workstation). VM is not necessary when the computer runs only one application at a time, and that application always fits in memory (example: PC running DOS, or Mac running plain Finder). However, when the computer is running multiple programs that fit in memory, but all programs are run by the same user, the use of VM is optional. On one hand, VM prevents programs from touching each other's memory, which usually results in a system crash. On the other hand, VM slows the system down. As a result, some multi-program, single-user system use VM (like OS/2) and some don't (like Windows and Multifinder). As a result, systems like those running OS/2 are much less likely to crash than are those running Windows or Multifinder.

Review of VM

Virtual memory uses disk space as an extension of physical memory. At any point in time, some of a program's data is in memory, and the rest of it is on disk. The program's data is brought into memory as necessary via the page fault mechanism.

A page fault occurs when the program accesses a memory location that does not currently have a valid mapping, that is, there is no page frame in physical memory with that address's data in it. Page faults are handled by the OS. During a page fault:

the MMU causes your program to be suspended, and activates the operating system
the OS selects a page frame for the new page
if the current page frame is "dirty", the OS writes the contents to disk
the OS invalidates the page table entry that used to point to this page
the OS creates an entry in the page table for the new page
if the new page is on disk, the OS loads it in from disk.
the OS restarts your program, right at the instruction that was suspended

Pages are dirty if they have been written to since they were last brought in from disk. The OS uses the W? bit in each page table entry to keep track of this.

The mapping from virtual pages to page frames is defined by the page table, and there is one page table per process.

Locality of reference

Page faults are expensive. Each time a page fault occurs, the program has to be suspended and the 7 steps above need to be performed. The really slow part is the disk accesses.

Luckily, page faults don't occur all that often because of locality of reference. Locality of reference means that the memory references made by a program tend to be close to each other from reference to reference. By close, we mean within a small number of addresses -- less than the 4K page size. Can you see why this is likely the case for program instruction fetches? Most of the time the next instruction (memory location + 4) follows the current one. However, it's also true for data references because of things like arrays - programs often read an entire array in order, so the references are very close.

Since programs exhibit locality of refernence, page faults aren't all that common. Once a page is brought in, it tends to be used over and over.

Locality of reference is another example of a measurement-based design decision (like register windows). It is locality of reference that makes virtual memory feasible. If programs had no locality of reference, they would page fault all the time and VM would be too slow to use.

In fact, there is at any point in time some set of pages that are being used a lot. For example, the page containing the code currently executing, plus the page containing the top of the stack, plus a page or two of data and bss. If these pages are all in memory, the program can execute for a long time without generating a page fault. These pages are called the program's working set.

Naturally, the working set changes over time, for example as the program counter moves through the program; but the working set changes slowly, so page faults are not frequent.

Page Replacement Policy

How does the OS decide which page frame to use during a page fault? Choosing a page at random is a bad idea -- it could generate even more page faults by kicking out pages that are part of the working set. A better idea is to try to guess what the working set of each program is.

One popular approach is the Least Recently Used (LRU) algorithm. The idea is to keep track of when each page was last accessed. To satisfy a page fault, the OS frees up the page frame that was used furthest in the past -- in some sense, least likely to be part of any program's working set.

If the working set is larger than the number of available page frames, no algorithm will work well, and page faults will be frequent. A program that generates page faults frequently and continuously is said to be thrashing.

The Memory Hierarchy

VM pages that are not being used are stored on disk in a special area called swap space. This is not a filesystem, it's just some disk space that is used to keep copies of everything that might need to be in memory.

Since any data stored in VM is available also on disk, VM is really just a copy of what's on disk. What is the distinction then, between main memory and disk memory? Only speed of access. Main Memory is faster than disk, that's the only important difference. For this reason main memory is sometimes called primary storage while disk is called secondary storage.

The idea behind VM is almost always to have the data needed where it can be accessed the fastest, namely in memory. However, when all else fails, you can always find the data on disk.

In this way, VM simulates a large, fast memory by using a small, fast memory combined with a large, slow memory. This is much cheaper than building a large, fast memory, and it is only a little bit slower than a large, fast memory (for programs with locality of reference, that is).

The memory hierarchy is usually shown as a pyramid:


                    CPU
              /             \
             / ------------- \ 
            /   Main Memory   \
           / ----------------- \
          /     Disk Memory     \
         / --------------------- \

which illustrates that we have a smaller amount of main memory than disk memory, and that the CPU directly accesses only the main memory, even though the REAL storage for data is on the disk.

Cache Memory

Cache is the natural extension of VM. Cache memory is the "top" of the memory hierarchy.


                /   CPU   \
               / --------- \
              / Cache Mem.  \
             / ------------- \ 
            /   Main Memory   \
           / ----------------- \
          /     Disk Memory     \
         / --------------------- \

Cache memory is organized similarly to Virtual Memory, however cache memory is:

smaller than main memory
faster than main memory
more expensive per byte than main memory.

In fact, looking at typical numbers for the memory hierarchy, we can see:

                /   CPU   \               Cost/Byte      Speed         Size
               / --------- \
              / Cache Mem.  \                $$$          10 ns        256 KB
             / ------------- \             
            /   Main Memory   \               $$         100 ns        32 MB  
           / ----------------- \
          /     Disk Memory     \              $          10 ms        1 GB
         / --------------------- \

The cost/byte and speed differences arise because the technologies are different. Cache memory is implemented using a technology called "Static Random Access Memory" or SRAM, while main memory is implemented using a technology called "Dynamic Random Access Memory" or DRAM, and disk memory is implemented using magnetic technology (like audio or video tapes).

Analogies between cache and VM

        Managed Units   Typical Size    "Data not Present" Event
        --------------  ------------    -------------------------
VM      Pages           4K Bytes        Page Fault
Cache   Cache Lines     256 Bytes       Cache Miss

Just as we saw that VM was a copy of data on disk, which was made so as to increase its access speed, so the cache is a copy of data in VM, again made to increase its access speed. So the CPU actually only operates on data from the cache -- it never goes directly to memory or to disk. If needed data is not in the cache (a cache miss), it is retrieved from main memory. If the data is not in main memory either (a page fault) it is retrieved from disk.

Since the addresses used to look up data in the cache are virtual addresses (no translation is performed on them) the cache must be cleared out (flushed) whenever a new process starts executing on the CPU.