General Rules for Processor Scheduling
- ESX(i) schedules VMs onto and off of processors as needed
- Whenever a VM is scheduled to a processor, all of the cores must be available for the VM to be scheduled or the VM cannot be scheduled at all
- If a VM cannot be scheduled to a prcoessor when it needs access, VM performance can suffer a great deal.
- When VMs are ready for a processor but are unable to be scheduled, this creates what VMware calls the CPU %Ready values
- CPU %Ready manifests itself as a utilisation issue but is actually a scheduling issue
- VMware attempts to schedule VMs on the same core over and over again and sometimes it has to move to another processor. Processor caches contain certain information that allows the OS to perform better. If the VM is actually moved across sockets and the cache isn’t shared, then it needs to be loaded with this new info.
- Maintain consistent Guest OS configurations
- Mixing Single, dual and quad core vCPUs VMs on the same ESX(i) server can create major scheduling problems. This is especially true when the ESX Server has low core densities or when the ESX servers average moderate to high utilisation levels
- Where possible reduce VMs to single vCPU VMs except if they host an application which requires multiple CPUs or if you find reducing on to one core is not possible to due to high utilisation on both cores on that particular VM
- Keep an eye on scheduling issues especially CPU% Ready. More than 2% indicates processor scheduling issues
Performance enhancers for vSphere
- Non scheduling of idle processors
vSphere has the ability to skip scheduling of idle processors. For example if a quad processor VM has activity on only 1 core, vSphere has the ability to schedule only that single core sometimes. A multi threaded app will likely be using most or all of its cores most of the time. If a VM has CPUs that are sitting idle a lot, it should be reviewed whether this VM actually needs the multiple processors
If your application is not multi-threaded, you gain nothing by adding cores to the VM and make it more difficult to schedule
2. Processor Skew
Guest OSs expect to see progress on all of their cores all of the time. vSphere has the ability to allow a small amount of skew whereby the processors need not be completely in sync but this has to be kept within reasonable limits
For a detailed description of how ESX(i) schedules VMs to processors please read
VMware® Virtual Symmetric Multi-Processing (SMP) enhances virtual machine performance by enabling a single virtual machine to use multiple physical processors, simultaneously. A unique VMware feature, Virtual SMP™ enables virtualization of the most processor- and
resource-intensive enterprise applications such as databases, ERP and CRM.
How Is VMware Virtual SMP Used in the Enterprise?
- Create development and testing environments that are more realistic and can be quickly and easily deployed
- Run resource intensive applications in virtualized environments.
- Run enterprise application such as databases, and ERP or CRM, in virtual machines.
- Scale computing environments without adding new hardware. Allow multiple processors to work together on a workload and increase utilization of existing resources.
- Improve software development and deployment.
How Does VMware Virtual SMP Work?
VMware Virtual SMP makes it possible for a single virtual machine to span up to four physical processors, or CPUs. These processors share the same memory, and work on any task regardless of the location of the task in memory. Virtual SMP co-schedules non-idle
virtual processors synchronously while allowing over-commitment of the processors. Idle virtual processors can be de-scheduled with the guest operating system running inside the virtual machine and then re-used for other tasks. Virtual SMP periodically moves processing tasks between the available processors to re-balance the work load. Virtual SMP has built-in controls to minimize overhead on the system.
- Poor performance of servers (if any) can be attributed to too many CPUs (as the wait time for CPU can increase if there are too any CPUs e.g. for a ‘single’threaded’ app on a a multi-vcpu VM)….
- As long as the operating system is designed to support SMP then the operating system is responsible for balancing the processes among all the available CPUs as evenly as it can. In most cases, adding more CPUs to an SMP system does not increase throughput in direct proportion to the new resources because workloads cannot always take advantage efficiently of multiple CPUs. There is also some overhead involved in sharing resources and scheduling processes. For example, a four-CPU SMP system is not four times as productive as a single-CPU system. The efficiency of a multiple-CPU SMP system, versus a single-CPU system, is defined by the workload’s ability to scale, or the workload’s scaling ratio. This scaling ratio varies for different workloads.
- SMP systems are also affected by the need for locking and synchronization of resources. Before a task can modify a shared data item, it must ensure that no other task will change the data item. This is usually done by means of a lock. While a process or thread is waiting to obtain a lock, it is not productive. Further, while a thread is waiting for the lock, some of its cache lines may be replaced. Thus, when the thread is scheduled again, it may experience higher memory latency. The operating system’s kernel contains many shared data items, so it must perform synchronization internally. Synchronization delays can occur even in an application program that does not share data with other programs because the kernel services have to serialize shared kernel data. In addition to lock contention, path length also increases because more code is being executed.
- The main advantage of deploying an SMP system is the ability to use multiple processors simultaneously to execute different tasks constituting a program, thereby increasing the throughput (for example, the number of transactions per second), compared to a single-CPU system. Only workloads that support parallelisation (including multiple processes or multiple threads that can run in parallel) can benefit from SMP. Single-threaded workloads can be scheduled to only one CPU at the time; and thus, cannot take advantage of additional CPUs being available. On the other hand, some modern applications with significant computational components have built in multi-threaded structures and a high scalability ratio. Good examples of the latter are Microsoft® SQL Server and Microsoft Exchange. Microsoft recommends deploying these applications on SMP systems with at least two CPUs