Some recent hardware has an asymmetric cache layout.
For example, the Intel Core 2 Quad has 4 cores, but the L2 cache is shared only within each pair of cores.
As a result, performance in multithreaded benchmarks varies widely depending on
which core each VM_Processor is mapped to.
This patch pins each pthread to a specific CPU number so that results are more predictable.
With this patch applied, users need only modify the user-level scheduler to pursue further optimizations for cache affinity studies.
The patch works on ia32-linux with NPTL.
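
For readers unfamiliar with thread pinning on Linux, the following is a minimal sketch of how a pthread can be bound to one CPU. It is not taken from the patch: it assumes the glibc/NPTL extension pthread_setaffinity_np is used, and the helper names (first_allowed_cpu, pin_self_to_cpu, is_pinned_to) are hypothetical, chosen only for illustration.

```c
#define _GNU_SOURCE            /* needed for cpu_set_t macros and the *_np calls */
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: return the lowest CPU number the calling thread
 * is currently allowed to run on, or -1 if the mask cannot be read. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}

/* Hypothetical helper: restrict the calling thread to a single CPU.
 * Returns 0 on success, or an error number (for example EINVAL if the
 * CPU is not available to this process). */
static int pin_self_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}

/* Hypothetical helper: return 1 if the calling thread's affinity mask
 * contains exactly the given CPU and nothing else, 0 otherwise. */
static int is_pinned_to(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return 0;
    return CPU_ISSET(cpu, &set) && CPU_COUNT(&set) == 1;
}
```

In a setup like the one described above, each pthread backing a VM_Processor would presumably call something like pin_self_to_cpu once at startup, with the CPU number chosen so that threads expected to share data land on cores that share an L2 cache. How the patch actually chooses the numbers is not stated here.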