The memory architecture used by an operating system is the most important key to understanding how the operating system does what it does. When you start working with a new operating system, many questions come to mind. “How do I share data between two applications?”, “Where does the system store the information I’m looking for?”, and “How can I make my program run more efficiently?” are just a few.
Every process is given its very own virtual address space. For 32-bit processes, this address space is 4 GB because a 32-bit pointer can have any value from 0x00000000 through 0xFFFFFFFF. This range allows a pointer to have one of 4,294,967,296 values, which covers a process’ 4-GB range. For 64-bit processes, this address space is 16 EB (exabytes) because a 64-bit pointer can have any value from 0x00000000’00000000 through 0xFFFFFFFF’FFFFFFFF. This range allows a pointer to have one of 18,446,744,073,709,551,616 values, which covers a process’ 16-EB range. This is quite a range!
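The counts quoted above follow directly from the pointer width. The following is a minimal C sketch; the helper name highest_address is hypothetical. Because the number of 64-bit addresses does not fit in a 64-bit integer, the helper returns the highest address, and the count is that value plus one.

```c
#include <stdint.h>

/* Highest address representable by a pointer of the given width (1..64 bits).
   The number of distinct addresses is this value plus one. */
uint64_t highest_address(unsigned pointer_bits)
{
    return (pointer_bits >= 64) ? UINT64_MAX
                                : ((UINT64_C(1) << pointer_bits) - 1);
}
```

For a 32-bit pointer the helper yields 0xFFFFFFFF, so the pointer can take 0xFFFFFFFF + 1 = 4,294,967,296 values.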
Because every process receives its own private address space, a thread running in a process can access only the memory that belongs to its own process. The memory that belongs to all other processes is hidden and inaccessible to the running thread.
In Windows, the memory belonging to the operating system itself is also hidden from the running thread, which means that the thread cannot accidentally access the operating system’s data.
Before you get all excited about having so much address space for your application, keep in mind that this is virtual address space—not physical storage. This address space is simply a range of memory addresses. Physical storage needs to be assigned or mapped to portions of the address space before you can successfully access data without raising access violations.
Each process’ virtual address space is split into partitions. The address space is partitioned based on the underlying implementation of the operating system. Partitions vary slightly among the different Microsoft Windows kernels. The table below shows how each platform partitions a process’ address space.
The partition of the process’ address space from 0x00000000 to 0x0000FFFF inclusive is set aside to help programmers catch NULL-pointer assignments. If a thread in your process attempts to read from or write to a memory address in this partition, an access violation is raised.
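The boundaries of this partition can be expressed as a simple range check. The sketch below is illustrative only; the function and macro names are hypothetical, and the bounds are the ones given in the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Last address (inclusive) of the NULL-pointer assignment partition,
   as given in the text. */
#define NULL_PARTITION_LAST 0x0000FFFFu

/* Returns true if the address lies in 0x00000000..0x0000FFFF. */
bool in_null_pointer_partition(uintptr_t address)
{
    return address <= NULL_PARTITION_LAST;
}
```

Because the partition is 64 KB wide rather than a single address, an access through a NULL structure pointer at a small field offset also lands inside it and is caught.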
Error checking is often not performed religiously in C/C++ programs. For example, the following code performs no error checking:
int* pnSomeInteger = (int*) malloc(sizeof(int));
*pnSomeInteger = 5;
If malloc cannot find enough memory to satisfy the request, it returns NULL. However, this code doesn’t check for that possibility; it assumes that the allocation was successful and proceeds to access memory at address 0x00000000. Because this partition of the address space is off-limits, a memory access violation occurs and the process is terminated. This feature helps developers find bugs in their applications. Notice that you can’t even reserve virtual memory in this address range with functions of the Win32 application programming interface (API).
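For comparison, here is a sketch of the same allocation with the missing check added. The wrapper function alloc_int is a hypothetical name introduced only for illustration.

```c
#include <stdlib.h>

/* Allocates one int and stores value in it.
   Returns NULL, without touching memory, if malloc fails. */
int *alloc_int(int value)
{
    int *pnSomeInteger = malloc(sizeof *pnSomeInteger);
    if (pnSomeInteger == NULL) {
        return NULL;            /* the caller must handle the failure */
    }
    *pnSomeInteger = value;     /* safe: the allocation succeeded */
    return pnSomeInteger;
}
```

The caller checks the result against NULL before using it and calls free when it is done with the integer.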
The user-mode partition is where the process’ address space resides. The usable address range and approximate size of the user-mode partition depend on the CPU architecture, as shown in the next table.
A process cannot use pointers to read from, write to, or in any way access another process’ data residing in this partition. For all applications, this partition is where the bulk of the process’ data is maintained. Because each process gets its own partition for data, applications are far less likely to be corrupted by other applications, making the whole system more robust.
In Windows, all .exe and dynamic-link library (DLL) modules load in this area. Each process might load these DLLs at a different address within this partition (although this is very unlikely). The system also maps all memory-mapped files accessible to this process within this partition.
When I first looked at my 32-bit process’ address space, I was surprised to see that the amount of usable address space was less than half of my process’ overall address space. After all, does the kernel-mode partition really need the top half of the address space? Actually, the answer is yes. The system needs this space for the kernel code, device driver code, device I/O cache buffers, nonpaged pool allocations, process page tables, and so on. In fact, Microsoft is squeezing the kernel into this 2-GB space. In 64-bit Windows, the kernel finally gets the room it truly needs.
Getting a Larger User-Mode Partition in x86 Windows
Some applications, such as Microsoft SQL Server, would benefit from a user-mode address space larger than 2 GB in order to improve performance and scalability by having more application data addressable. So the x86 version of Windows offers a mode to increase the user-mode partition up to a maximum of 3 GB. To have all processes use a larger-than-2-GB user-mode partition and a smaller-than-2-GB kernel-mode partition, you need to configure the boot configuration data (BCD) in Windows and then reboot the machine. (Read the white paper available at http://www.microsoft.com/whdc/system/platform/firmware/bcd.mspx for more details about the BCD.)
To configure the BCD, you need to execute BCDEdit.exe with the /set switch with the IncreaseUserVA parameter. For example, bcdedit /set IncreaseUserVa 3072 tells Windows to reserve, for all processes, a 3-GB user-mode address space region and a 1-GB kernel-mode address space region. The “x86 w/3 GB” row in Table 13-2 shows how the address space looks when the IncreaseUserVa value is set to 3072. The minimum value accepted for IncreaseUserVa is 2048, corresponding to the 2-GB default. If you want to explicitly reset this parameter, execute the following command: bcdedit /deletevalue IncreaseUserVa.
When you need to figure out the current value of the parameters of the BCD, simply type bcdedit /enum on the command line. (Go to http://msdn2.microsoft.com/en-us/library/aa906211.aspx for more information about BCDEdit parameters.)
Some older applications assume that a user-mode address never has its high bit set, for example by using that bit as a flag. Microsoft had to create a solution that allowed such applications to work in a large user-mode address space environment. When the system is about to run an application, it checks to see if the application was linked with the /LARGEADDRESSAWARE linker switch. If so, the application is claiming that it does not do anything funny with memory addresses and is fully prepared to take advantage of a large user-mode address space. On the other hand, if the application was not linked with the /LARGEADDRESSAWARE switch, the operating system reserves any user-mode space between 2 GB and the start of kernel mode. This prevents any memory allocations from being created at a memory address whose high bit is set.
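To see why an address with its high bit set can break an application, consider a program that stores a flag in bit 31 of a 32-bit address. The sketch below is a hypothetical illustration of that kind of trick, not code from any particular application; all names are invented.

```c
#include <stdbool.h>
#include <stdint.h>

#define HIGH_BIT 0x80000000u   /* bit 31 of a 32-bit address */

/* Hypothetical trick: keep a flag in the high bit of an address. */
uint32_t tag_address(uint32_t address)
{
    return address | HIGH_BIT;
}

/* Reports whether the flag appears to be set. */
bool is_tagged(uint32_t value)
{
    return (value & HIGH_BIT) != 0;
}

/* Strips the flag to recover the original address. */
uint32_t untag_address(uint32_t value)
{
    return value & ~HIGH_BIT;
}
```

The round trip works for any address below 2 GB. An untouched address such as 0x80010000, however, already looks tagged, and untag_address turns it into 0x00010000, which is a different location. Keeping allocations below 2 GB for applications that are not marked large-address-aware avoids this failure.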
Note that all the code and data required by the kernel is squeezed tightly into a 2-GB partition. So reducing the kernel address space to less than 2 GB restricts the number of threads, stacks, and other resources that the system can create. In addition, the system can use a maximum of only 64 GB of RAM, unlike the 128-GB maximum available when the default of 2 GB is used.
An executable’s LARGEADDRESSAWARE flag is checked when the operating system creates the process’ address space. The system ignores this flag for DLLs. DLLs must be written to behave correctly in a large 2+ GB user-mode partition or their behavior is undefined.
In 64-bit Windows, the 8-TB user-mode partition looks greatly out of proportion to the 16,777,208-TB kernel-mode partition. It’s not that the kernel-mode partition requires all of this virtual address space. It’s just that a 64-bit address space is enormous and most of that address space is unused. The system allows our applications to use 8 TB and allows the kernel to use what it needs; the majority of the kernel-mode partition is just not used. Fortunately, the system does not require any internal data structures to maintain the unused portions of the kernel-mode partition.
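The kernel-mode figure is simply what remains of the 64-bit address space once the user-mode partition is subtracted. Here is a small sketch of that arithmetic; the function name is hypothetical, and 1 TB is taken as 2^40 bytes.

```c
#include <stdint.h>

/* Size of the kernel-mode partition in TB, given the user-mode size in TB.
   A 64-bit address space is 2^64 bytes = 2^24 TB (1 TB = 2^40 bytes). */
uint64_t kernel_partition_tb(uint64_t user_partition_tb)
{
    const uint64_t total_tb = UINT64_C(1) << (64 - 40);  /* 16,777,216 TB */
    return total_tb - user_partition_tb;
}
```

With an 8-TB user-mode partition, the function returns 16,777,208 TB.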
Regions in an Address Space
When a process is created and given its address space, the bulk of this usable address space is free, or unallocated. To use portions of this address space, you must allocate regions within it by calling VirtualAlloc. The act of allocating a region is called reserving.
Whenever you reserve a region of address space, the system ensures that the region begins on an allocation granularity boundary. The allocation granularity can vary from one CPU platform to another. However, as of this writing, all the CPU platforms use the same allocation granularity of 64 KB; that is, the starting address of a reserved region is rounded down to a 64-KB boundary.
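The boundary calculation can be sketched in a few lines of C. This assumes that a requested base address is rounded down to the granularity boundary, which is how VirtualAlloc is documented to treat reservations. The macro and function names are hypothetical.

```c
#include <stdint.h>

#define ALLOCATION_GRANULARITY 0x10000u   /* 64 KB, per the text */

/* Rounds an address down to the nearest allocation granularity boundary. */
uintptr_t align_to_granularity(uintptr_t address)
{
    return address & ~(uintptr_t)(ALLOCATION_GRANULARITY - 1);
}
```

For example, a requested address of 0x00123456 becomes 0x00120000, while an address that already sits on a boundary is left unchanged.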
When you reserve a region of address space, the system ensures that the size of the region is a multiple of the system’s page size. A page is a unit of memory that the system uses in managing memory. Like the allocation granularity, the page size can vary from one CPU to another. The x86 and x64 systems use a 4-KB page size, but the IA-64 uses an 8-KB page size.
Sometimes the system reserves regions of address space on behalf of your process. For example, the system allocates a region of address space to store a process environment block (PEB). A PEB is a small data structure created, manipulated, and destroyed entirely by the system. When a process is created, the system allocates a region of address space for the PEB.
The system also needs to create thread environment blocks (TEBs) to help manage all the threads that currently exist in the process. The regions for these TEBs will be reserved and released as threads in the process are created and destroyed.
Although the system demands that any of your requests to reserve address space regions begin on an allocation granularity boundary (64 KB on all platforms), the system itself is not subjected to the same limitation. It is extremely likely that the region reserved for your process’ PEB and TEBs will not start on a 64-KB boundary. However, these reserved regions will still have to be a multiple of the CPU’s page size.
If you attempt to reserve a 10-KB region of address space, the system will automatically round up your request and reserve a region whose size is a multiple of the page size. This means that on x86 and x64 systems, the system will reserve a region that is 12 KB; on an IA-64 system, the system will reserve a 16-KB region.
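The rounding described here can be sketched as follows. The function name is hypothetical, and the page size is passed in as a parameter so that both the 4-KB and 8-KB cases can be tried.

```c
#include <assert.h>
#include <stddef.h>

/* Rounds size up to the next multiple of page_size.
   page_size must be a nonzero power of two (4 KB or 8 KB in the text). */
size_t round_up_to_page(size_t size, size_t page_size)
{
    assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
    return (size + page_size - 1) & ~(page_size - 1);
}
```

A 10-KB request (10,240 bytes) rounds up to 12,288 bytes with 4-KB pages and to 16,384 bytes with 8-KB pages, matching the figures above.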
When your program’s algorithms no longer need to access a reserved region of address space, the region should be freed. This process is called releasing the region of address space and is accomplished by calling the VirtualFree function.
Committing Physical Storage within a Region
To use a reserved region of address space, you must allocate physical storage and then map this storage to the reserved region. This process is called committing physical storage. Physical storage is always committed in pages. To commit physical storage to a reserved region, you again call the VirtualAlloc function.
When you commit physical storage to regions, you do not have to commit physical storage to the entire region. For example, you can reserve a region that is 64 KB and then commit physical storage to the second and fourth pages within the region. The figure below shows what a process’ address space might look like. Notice that the address space is different depending on which CPU platform you’re running on. The address space on the left shows what happens on x86/x64 machines (which have a 4-KB page), and the address space on the right shows what happens on an IA-64 machine (which has 8-KB pages).
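The 64-KB example can be modeled with a small toy structure that tracks the state of each page. This is only a bookkeeping model written in portable C. It does not call the Win32 API, and every name in it is hypothetical.

```c
#include <stddef.h>

#define REGION_SIZE (64u * 1024u)   /* the 64-KB region from the text */
#define MAX_PAGES   16              /* 64 KB / 4 KB, the smallest page size modeled */

typedef enum { PAGE_RESERVED, PAGE_COMMITTED } PageState;

typedef struct {
    size_t    page_size;
    size_t    page_count;
    PageState pages[MAX_PAGES];
} Region;

/* Reserve: every page starts out reserved, with no storage behind it.
   Returns 0 on success, -1 if the page size cannot be modeled. */
int region_reserve(Region *r, size_t page_size)
{
    if (page_size == 0 || REGION_SIZE / page_size > MAX_PAGES) return -1;
    r->page_size  = page_size;
    r->page_count = REGION_SIZE / page_size;
    for (size_t i = 0; i < r->page_count; ++i) r->pages[i] = PAGE_RESERVED;
    return 0;
}

/* Commit storage to one page (0-based index). Returns 0 on success. */
int region_commit_page(Region *r, size_t index)
{
    if (index >= r->page_count) return -1;
    r->pages[index] = PAGE_COMMITTED;
    return 0;
}

/* Bytes of storage currently committed within the region. */
size_t region_committed_bytes(const Region *r)
{
    size_t committed = 0;
    for (size_t i = 0; i < r->page_count; ++i) {
        if (r->pages[i] == PAGE_COMMITTED) ++committed;
    }
    return committed * r->page_size;
}
```

Committing the second and fourth pages (indexes 1 and 3) uses 8 KB of storage when pages are 4 KB and 16 KB of storage when pages are 8 KB, even though the reserved region is 64 KB in both cases.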
When your program’s algorithms no longer need to access committed physical storage in the reserved region, the physical storage should be freed. This process is called decommitting the physical storage and is accomplished by calling the VirtualFree function.
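The full life cycle of reserve, commit, decommit, and release might look like the following sketch. It uses the VirtualAlloc and VirtualFree functions named in the text. The function name reserve_commit_demo, the 4-KB page size, and the choice of which page to commit are assumptions made for illustration. The Windows calls are guarded so that the file still compiles on other platforms, where the function simply returns -1.

```c
#ifdef _WIN32
#include <windows.h>
#endif

/* Reserves 64 KB, commits one page inside it, writes to that page,
   then decommits the page and releases the region.
   Returns 0 on success, 1 on failure, and -1 when not built for Windows. */
int reserve_commit_demo(void)
{
#ifdef _WIN32
    const SIZE_T regionSize = 64 * 1024;
    const SIZE_T pageSize   = 4096;   /* assumption: x86/x64 page size */

    /* Reserve address space only; no physical storage is attached yet. */
    BYTE *region = (BYTE *)VirtualAlloc(NULL, regionSize,
                                        MEM_RESERVE, PAGE_NOACCESS);
    if (region == NULL) return 1;

    /* Commit storage to the second page of the region. */
    BYTE *page = (BYTE *)VirtualAlloc(region + pageSize, pageSize,
                                      MEM_COMMIT, PAGE_READWRITE);
    if (page == NULL) {
        VirtualFree(region, 0, MEM_RELEASE);
        return 1;
    }
    page[0] = 42;   /* safe: this page now has storage behind it */

    /* Decommit the page, then release the whole region. */
    BOOL ok = VirtualFree(page, pageSize, MEM_DECOMMIT);
    ok = VirtualFree(region, 0, MEM_RELEASE) && ok;
    return ok ? 0 : 1;
#else
    return -1;      /* VirtualAlloc is available only on Windows */
#endif
}
```

Note that releasing a region requires passing its base address with a size of 0 and MEM_RELEASE, whereas decommitting takes the address and size of the pages to decommit.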
Physical Storage and the Paging File
In older operating systems, physical storage was considered to be the amount of RAM that you had in your machine. In other words, if you had 16 MB of RAM in your machine, you could load and run applications that used up to 16 MB of RAM. Today’s operating systems have the ability to make disk space look like memory. The file on the disk is typically called a paging file, and it contains the virtual memory that is available to all processes.
Of course, for virtual memory to work, a great deal of assistance is required from the CPU itself. When a thread attempts to access a byte of storage, the CPU must know whether that byte is in RAM or on the disk.
From an application’s perspective, a paging file transparently increases the amount of RAM (or storage) that the application can use. If you have 1 GB of RAM in your machine and also have a 1-GB paging file on your hard disk, the applications you’re running believe that your machine has a grand total of 2 GB of RAM.
Of course, you don’t actually have 2 GB of RAM. Instead, the operating system, in coordination with the CPU, saves portions of RAM to the paging file and loads portions of the paging file back into RAM as the running applications need them. Because a paging file increases the apparent amount of RAM available for applications, the use of a paging file is optional. If you don’t have a paging file, the system just thinks that there is less RAM available for applications to use. However, users are strongly encouraged to use paging files so that they can run more applications and those applications can work on larger data sets.
It is best to think of physical storage as data stored in a paging file on a disk drive (usually a hard disk drive). So when an application commits physical storage to a region of address space by calling the VirtualAlloc function, space is actually allocated from a file on the hard disk. The size of the system’s paging file is the most important factor in determining how much physical storage is available to applications; the amount of RAM you have has very little effect.
Now when a thread in your process attempts to access a block of data in the process’ address space, one of two things can happen, as shown in the simplified flowchart in the figure below.
The more often the system needs to copy pages of memory to the paging file and vice versa, the more your hard disk thrashes and the slower the system runs. (Thrashing means that the operating system spends all its time swapping pages in and out of memory instead of running programs.) Thus by adding more RAM to your computer, you reduce the amount of thrashing necessary to run your applications, which will, of course, greatly improve the system’s performance. So here is a general rule of thumb: to make your machine run faster, add more RAM. In fact, for most situations, you’ll get a better performance boost from adding RAM than you will by getting a faster CPU.
Physical Storage Not Maintained in the Paging File
After reading the previous section, you must be thinking that the paging file can get pretty large if many programs are all running at once—especially if you’re thinking that every time you run a program the system must reserve regions of address space for the process’ code and data, commit physical storage to these regions, and then copy the code and data from the program’s file on the hard disk to the committed physical storage in the paging file.
The system does not do what I just described; if it did, it would take a very long time to load a program and start it running. Instead, when you invoke an application, the system opens the application’s .exe file and determines the size of the application’s code and data. Then the system reserves a region of address space and notes that the physical storage associated with this region is the .exe file itself. That’s right—instead of allocating space from the paging file, the system uses the actual contents, or image, of the .exe file as the program’s reserved region of address space. This, of course, makes loading an application very fast and allows the size of the paging file to remain small.
When a program’s file image (that is, an .exe or a DLL file) on the hard disk is used as the physical storage for a region of address space, it is called a memory-mapped file. When an .exe or a DLL is loaded, the system automatically reserves a region of address space and maps the file’s image to this region. However, the system also offers a set of functions that allow you to map data files to a region of address space.
When an .exe or a DLL file is loaded from a floppy disk, the system copies the entire file into RAM, backed by the paging file, instead of mapping the file directly. Microsoft was forced to make image files executed from floppies work this way so that setup applications would work correctly. Often a setup program begins with one floppy, which the user removes from the drive in order to insert another floppy. If the system needs to go back to the first floppy to load some of the .exe’s or the DLL’s code, that floppy is, of course, no longer in the drive. However, because the system copied the file to RAM (where it is backed by the paging file), it will have no trouble accessing the setup program.
The system does not copy image files to RAM from other removable media, such as CD-ROMs, or from network drives, unless the image is linked using the /SWAPRUN:CD or /SWAPRUN:NET switches.
Individual pages of allocated physical storage can be assigned different protection attributes. The protection attributes are shown in the table below.
Some malware applications write code into areas of memory intended for data (such as a thread’s stack) and then execute that malicious code. Windows’ Data Execution Prevention (DEP) feature provides protection against this type of malware attack. With DEP enabled, the operating system uses the PAGE_EXECUTE_* protections only on regions of memory that are intended to have code execute; other protections (typically PAGE_READWRITE) are used for regions of memory intended to have data in them (such as thread stacks and the application’s heaps).
Region Physical Storage Types: