(Parts of this were already written in English earlier, so I am writing this post in English as well.)
Handling memory in virtualization is a complex topic, especially in today’s hypervisors, which implement many optimization techniques. This is slightly older material collected for ESX 4.1, but most of it still applies to ESX 5.0 too.
Goal: We wanted to answer a simple question: how much memory does a virtual machine currently consume? Of course this is a rather complicated question; one has to understand the basics of memory management in virtualized systems and know the exact meaning of the numerous memory-related performance metrics presented by ESX.
There are a lot of in-depth materials for memory management in ESX:
- For a detailed description of memory management in ESX/ESXi see the vSphere Resource Management Guide (it includes memory virtualization techniques, overcommitment methods, shares/reservation/limits). The section “Measuring and Differentiating Types of Memory Usage” is especially relevant for understanding the different metrics.
- Understanding Memory Resource Management in VMware ESX 4.1 (VMware Technical Paper) also covers memory management in detail.
- Another great source of information on performance is the tutorial “VMware Performance for Gurus”.
- The official documentation of the memory metrics can be found in the VI API, but it is very limited, and some counters are missing.
- A much better explanation is “Memory Performance Chart Metrics in the vSphere Client” (by jbrodeur, VMware Community DOC-10398).
- “Interpreting esxtop 4.1 statistics” (by haiping, VMware Community DOC-11812) also contains a lot of information. However, esxtop uses a different set of counters, and for some of them there is no equivalent in the VI API. (The “Memory Performance Analysis and Monitoring” document [by Scott Drummonds, VMware Community DOC-5430] contains some mappings.)
- “VirtualCenter Memory Statistic Definitions” (by Kit Colbert) contains some explanations to the metrics, and a simple example to calculate granted/consumed and shared/shared common (it is for VI 3.5, but much of the information is valid for VI 4.x).
2. Description of the metrics
What we were really missing in the beginning was an overview figure illustrating the relationships between the counters. A good start is the figure in “Understanding ESX Memory Management at Partner Exchange 2009”, but it lacks some details.
Thus I have put together the following figure based on the information in all the above sources and some experiments with a simple VM. Texts in blue represent metric names in the VI API.
NOTE: the regions are not necessarily contiguous or aligned as drawn; they can be distributed across the whole address spaces. The figure is arranged this way only for easier illustration.
VM level: The virtual machine has some total memory allocated; this is the guest physical memory.
- As long as the VM has not touched a memory page at least once, ESX does not care about that page. These pages form the “not touched” region; there is no statistic for its size.
- If there is no memory pressure on the host and no memory overcommit technique is active, then all the other pages have to be mapped to machine memory pages. This is represented in mem.granted.
- Some of the granted pages are fully zeroed out. ESX detects these at a certain scan rate and reports them in mem.zero. These pages are all mapped to the same zero page in machine memory, so they are stored only once.
- The transparent page sharing (TPS) component in ESX also detects pages that contain exactly the same data as other pages (in the memory of this VM or of other VMs on the same host). These are also stored only once in machine memory. Their number is reported in mem.shared (which also includes the zero pages).
- A different subset of the granted memory is mem.active. ESX estimates this using sampling and averaging, so it takes time to converge to the actual value [from Interpreting esxtop: “VMKernel estimates active memory usage for a VM by sampling a random subset of the VM’s memory resident in machine memory to detect the number of memory reads and writes. VMKernel then scales this number by the size”]. Active memory can be shared or private.
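The sampling-based estimation quoted above can be illustrated with a small sketch. The function, page counts and sample size below are hypothetical; this only demonstrates the statistical idea, not VMware’s actual implementation:

```python
import random

def estimate_active_kb(touched_pages, total_pages, sample_size=100, page_kb=4):
    """Estimate active memory by sampling random pages and scaling
    the touched fraction up to the full guest memory size."""
    sample = random.sample(range(total_pages), sample_size)
    touched_in_sample = sum(1 for page in sample if page in touched_pages)
    return touched_in_sample / sample_size * total_pages * page_kb

random.seed(0)                  # reproducible sampling for the example
total_pages = 262144            # 1024 MB of 4 KB guest pages
touched = set(range(65536))     # guest actively uses the first 256 MB
print(estimate_active_kb(touched, total_pages))
```

With 100 samples the estimate hovers around the true 256 MB working set; a larger sample (or averaging several sampling periods, as ESX does) tightens the estimate.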
If there is not enough memory on the host to satisfy all requests, ESX uses ballooning, memory compression or host level swapping.
- VMware Tools can reserve memory inside the VM (“ballooning”); the reserved pages are then not mapped to any machine memory page. The current size of the balloon is reported in mem.vmmemctl.
- ESX can compress some of the guest memory pages and store them in a compressed format in machine memory (“memory compression”). The description of these counters is missing from the API documentation (see my question regarding this here), but our tests suggest that mem.zipped is the relevant counter.
- Finally, some pages can be swapped out to the host swap file. The current amount is stored in mem.swapped.
Thus the total memory of the VM is divided into the following areas:
total_memory_size = granted + vmmemctl + swapped + zipped + not_touched
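This identity can be checked with a quick sketch. The KB values below are made up for illustration; only the identity itself comes from the text:

```python
# All values in KB; the total corresponds to a 1024 MB VM.
total_memory_size = 1024 * 1024
granted  = 700000
vmmemctl = 200000   # ballooned inside the guest
swapped  = 50000    # in the host swap file
zipped   = 30000    # in the compression cache

# Whatever is left was never touched by the guest:
not_touched = total_memory_size - (granted + vmmemctl + swapped + zipped)
print(not_touched)  # → 68576

# The VM-level identity from above holds:
assert total_memory_size == granted + vmmemctl + swapped + zipped + not_touched
```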
Host level: the following parts are stored in machine memory on behalf of the current VM:
- Running a VM requires some extra memory to store auxiliary structures. This is mem.overhead; its size depends on the number of vCPUs, the total VM memory size and the number of processes inside the guest (as a shadow page table has to be maintained for every guest page table).
- Some parts of the guest memory are actually stored in machine memory; this is reflected in mem.consumed.
- “When multiple VMs are sharing a single region of machine memory, each VM is ‘charged’ for the memory proportionally based on the total references to that region of shared memory.”
- (Note: we did not find any information on whether consumed also includes the compression cache, but it seems logical. The actual size of the compression cache is perhaps mem.zipped − mem.zipSaved?)
Thus at the host level the memory footprint of a virtual machine is computed as:
host_memory_usage = consumed + overhead
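The proportional “charging” rule for shared memory quoted above can be sketched as follows; the helper function and all numbers are hypothetical, for illustration only:

```python
def consumed_kb(private_kb, shared_regions):
    """Charge the VM fully for its private pages and proportionally
    (1/ref-count) for each shared machine memory region it references.
    shared_regions: list of (region_size_kb, total_refs, this_vm_refs)."""
    charged = sum(size_kb * vm_refs / total_refs
                  for size_kb, total_refs, vm_refs in shared_regions)
    return private_kb + charged

# 300000 KB mapped privately, plus two shared regions:
#  - 4096 KB referenced 4 times in total, twice by this VM
#  - 8192 KB referenced 2 times in total, once by this VM
print(consumed_kb(300000, [(4096, 4, 2), (8192, 2, 1)]))  # → 306144.0
```

Note how sharing reduces the charge: the VM references 4096 + 8192 KB of shared regions but is charged only 2048 + 4096 KB for them.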
3. Tests with a simple VM
Next we did some tests to check whether the above assumptions were correct. The VM was placed in various situations to trigger the different memory overcommitment techniques, and the memory metrics were collected using the Get-Stat cmdlet of PowerCLI.
Test VM: Windows Server 2003 with VMware Tools, 1024 MB RAM; running on an ESX 4.1 host that is not overcommitted (more than 8 GB free RAM); no reservation or limit was initially set on the VM.
3.1 Startup of the VM
The following can be observed on the picture:
- granted [purple on top] climbs quickly (Windows touches all memory during boot), tops at 1048540 KB and stays constant
- no memory pressure on the host, thus swap/balloon/compression remains 0
- active [light blue with steps] decreases after a few initial peaks as the VM does nothing after booting, and after each sampling ESX estimates this correctly.
- overhead [yellow] is mostly constant at 53856 KB (small changes occur)
- after the start, transparent page sharing (TPS) kicks in and detects zero pages [climbing red] as well as some shareable nonzero pages [shared, climbing light blue] (difference between the two: 680568 KB vs. 693872 KB). In the meantime consumed [green] decreases, as the zero pages get unmapped.
The changes in the relevant performance metrics from VI API:
3.2 Using ballooning
To simulate memory pressure on the host, a limit of 700 MB was added to the VM (while its memory size is still 1024 MB).
With Sysinternals’ testlimit.exe, 700 MB of memory was touched and leaked in the VM:
testlimit.exe -d 100 -c 7
The relevant performance metrics can be seen on the following figure:
The legend for the above graph:
But it can be followed better by looking at the numbers:
The following happens:
- around line 8 memory is reserved in the VM, mem.consumed increases over the limit, thus ballooning kicks in [vmmemctltarget > vmmemctl, thus the balloon inflates],
- as the balloon increases, granted memory decreases. Consumed also decreases, because ballooned memory is not mapped to machine memory,
- after TPS detects zero pages, those get shared, so the balloon starts to deflate.
3.3. Memory compression
VMware Tools was uninstalled, so ballooning cannot be used. This way, compression is used first when the host runs out of memory.
It can be seen that granted starts to decrease as memory is compressed.
3.4 Swapping & compression
Test: VMware Tools uninstalled + high memory activity inside the guest + 300 MB limit on the VM.
When ballooning is not active, and the memory pressure is even greater, ESX uses host level swapping:
The values of the performance metrics obtained through PowerCLI:
What can be seen from the numbers:
- If swaptarget > swapped, then ESX starts to swap. When pages are moved to the swap file, swapout increases.
- Pages are only moved back to memory when they are needed; in this case swapin increases.
- Both swapout and swapin are cumulative counters, which start when the VM is powered on.
- swapped = swapout – swapin
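The relationship between the cumulative counters and the instantaneous swapped value can be replayed with a few made-up samples:

```python
# (swapout_cumulative, swapin_cumulative) samples in KB, made up for
# illustration; both counters only ever grow after the VM is powered on.
samples = [
    (0, 0),          # power-on: nothing has been swapped yet
    (80000, 0),      # host pressure: 80000 KB moved to the swap file
    (80000, 30000),  # guest touched pages, 30000 KB swapped back in
]
for swapout, swapin in samples:
    swapped = swapout - swapin   # amount currently in the swap file
    print(swapped)
# prints 0, then 80000, then 50000
```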
Conclusion: we can relax, as the numbers (more or less) backed up the theoretical calculations of the different metrics. Thus the summary figure can help answer our initial question of which metrics should be consulted about a VM’s memory consumption.
Update: You can find a version of the PowerShell script for collecting statistics here. This is a slightly newer version; I used it with vCenter 5, but it probably also works with vCenter 4. The metric names and the name of the vCenter to query are hardcoded, but after adjusting these the script should work. However, I have only tested it in a simple environment.