[RFC/Benchmarks] Per-CPU freelists

Suleiman Souhlal

I implemented per-CPU page freelists this weekend, in the hope that they
would improve performance on SMP machines: in most cases they should save
a spinlock acquisition in vm_page_alloc() (the exceptions being when
VM_ALLOC_INTERRUPT is set and when the current CPU's free list is
empty) and reduce contention. However, I was only able to test the patch
on a machine with two CPUs, where it didn't seem to make any difference.

You can set the number of pages that get added to a freelist each
time it is refilled in vm.pcpu.refill_num, and the maximum length of
a freelist in vm.pcpu.max_len. Some stats are viewable in vm.pcpu.stats.
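
For readers who have not looked at the patch, here is a rough user-space
sketch of the general idea. It is not the patch itself. It assumes a
singly linked list of pages, a global list behind a lock, and one
lock-free list per CPU. All names (page_alloc, page_free, seed_pages,
lock_acquisitions, NCPU) are made up for illustration, the lock is a
counting stand-in rather than a real spinlock, and the handling of the
interrupt case is a guess at a plausible slow path. A kernel version
would also have to keep the thread from migrating or being preempted
while it touches its CPU's list.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 2          /* hypothetical CPU count */

struct page { struct page *next; };
struct freelist { struct page *head; int len; };

int refill_num = 32;    /* stand-in for vm.pcpu.refill_num */
int max_len = 64;       /* stand-in for vm.pcpu.max_len */

struct freelist global_free;        /* guarded by the global lock */
struct freelist pcpu_free[NCPU];    /* one per CPU, used without the lock */
unsigned long lock_acquisitions;    /* counts global lock round trips */

/* Stand-ins for the spinlock; a kernel would spin here. */
static void global_lock(void) { lock_acquisitions++; }
static void global_unlock(void) { }

static void fl_push(struct freelist *fl, struct page *p)
{
    p->next = fl->head;
    fl->head = p;
    fl->len++;
}

static struct page *fl_pop(struct freelist *fl)
{
    struct page *p = fl->head;

    if (p != NULL) {
        fl->head = p->next;
        fl->len--;
    }
    return p;
}

/* Put n pages from a caller-supplied array on the global list. */
void seed_pages(struct page *pool, int n)
{
    for (int i = 0; i < n; i++)
        fl_push(&global_free, &pool[i]);
}

struct page *page_alloc(int cpu, int from_interrupt)
{
    struct freelist *fl = &pcpu_free[cpu];
    struct page *p;

    if (from_interrupt) {
        /* Assumed slow path: bypass the per-CPU list entirely. */
        global_lock();
        p = fl_pop(&global_free);
        global_unlock();
        return p;
    }
    if (fl->len == 0) {
        /* Per-CPU list empty: refill one batch under the lock. */
        global_lock();
        for (int i = 0; i < refill_num && global_free.len > 0; i++)
            fl_push(fl, fl_pop(&global_free));
        global_unlock();
    }
    return fl_pop(fl);          /* NULL when no pages are left */
}

void page_free(int cpu, struct page *p)
{
    struct freelist *fl = &pcpu_free[cpu];

    if (fl->len < max_len) {
        fl_push(fl, p);         /* fast path, no lock */
        return;
    }
    global_lock();              /* list full: spill to the global list */
    fl_push(&global_free, p);
    global_unlock();
}
```

With refill_num=32, allocating 64 pages on one CPU takes the global lock
twice in this sketch instead of 64 times, which is the saving the patch
is after.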

The patch is available at

I would really appreciate it if someone could benchmark/test this on a
machine with more processors.

Here's the output of ministat(1) for buildkernel:
x refill_num=32, max_len=64
+ refill_num=4, max_len=-1 (effectively disabling the percpu freelists)
|+         +           x  x+ x                       +       +  x         x|
|    |_____________|_______M_MA____________A____________|__________|       |
     N           Min           Max        Median           Avg        Stddev
x   5        171.64        172.09        171.69       171.816    0.21220273
+   5        171.44        171.97        171.67       171.702    0.22928149
No difference proven at 95.0% confidence
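
The verdict above can be checked by hand from the summary rows. The
sketch below assumes a pooled-variance Student's t comparison, which is
my understanding of what ministat does, and uses 2.306 as the two-sided
95% critical value for 5 + 5 - 2 = 8 degrees of freedom. The function
name difference_proven is made up for illustration. It compares squared
quantities so that it does not need sqrt().

```c
#include <assert.h>

/*
 * Returns 1 if |avg1 - avg2| is at least t_crit times the standard
 * error of the difference (computed from the pooled variance), and
 * 0 otherwise. Both sides are squared to avoid calling sqrt().
 */
int difference_proven(int n1, double avg1, double sd1,
                      int n2, double avg2, double sd2, double t_crit)
{
    double d = avg1 - avg2;
    double pooled_var = ((n1 - 1) * sd1 * sd1 + (n2 - 1) * sd2 * sd2)
        / (n1 + n2 - 2);
    double se2 = pooled_var * (1.0 / n1 + 1.0 / n2);

    return d * d >= t_crit * t_crit * se2;
}
```

For the numbers above the averages differ by 0.114 and the standard
error of the difference comes out to about 0.14, so the t statistic is
roughly 0.8, well below 2.306.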

-- Suleiman