NUMA Support is there in FreeBSD.

NUMA Support is there in FreeBSD.

satish kondapalli
Hi,

I am new to FreeBSD and would like to know whether FreeBSD supports NUMA.
If it does, what are the kernel APIs for allocating memory?
Is there an example driver, or any existing driver, that uses the NUMA API?

Please provide some input.

Thanks
Sateesh

Re: NUMA Support is there in FreeBSD.

mdf-2
On Mon, Oct 3, 2011 at 7:55 AM, satish kondapalli <[hidden email]> wrote:
> I am new to FreeBSD, I just want know whether FreeBSD supports NUMA.
> If FreeBSD supports NUMA what are the kernel API to allocate memory?
> is there any example driver or any driver which is using the NUMA API?
>
> please provide some inputs...

The kernel is NUMA-aware (at least for x86), and memory is allocated
round-robin amongst the memory domains.  There are not yet any KPIs
(kernel programming interfaces) for allocating memory in a specific NUMA
domain, nor for binding specific threads / processes so that they get
memory local to a bound CPU instead of round-robin.
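
To make "round-robin amongst the memory domains" concrete, here is a toy model in Python. It is only an illustration and not FreeBSD code: the function name and the two-domain setup are assumptions made for the example.

```python
from collections import Counter
from itertools import cycle


def round_robin_placement(num_pages, domains):
    """Assign each page to the next domain in turn, wrapping around."""
    next_domain = cycle(domains)
    return [next(next_domain) for _ in range(num_pages)]


# Hypothetical machine with two memory domains, 0 and 1.
placement = round_robin_placement(8, domains=[0, 1])
pages_per_domain = Counter(placement)

print(placement)               # [0, 1, 0, 1, 0, 1, 0, 1]
print(dict(pages_per_domain))  # {0: 4, 1: 4}
```

The point of the model is that the caller has no say in which domain a page comes from; the pages are simply spread evenly.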

There have been several discussions but no one has taken the lead and
proposed some KPIs yet.  At $WORK the round-robin is sufficient to get
consistent performance numbers and we have not yet started any
experimentation with binding specific threads to either CPU or memory.

Cheers,
matthew
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"

Re: NUMA Support is there in FreeBSD.

Arnaud Lacombe-6
Hi,

On Mon, Oct 3, 2011 at 12:31 PM,  <[hidden email]> wrote:
> On Mon, Oct 3, 2011 at 7:55 AM, satish kondapalli <[hidden email]> wrote:
>> I am new to FreeBSD, I just want know whether FreeBSD supports NUMA.
>> If FreeBSD supports NUMA what are the kernel API to allocate memory?
>> is there any example driver or any driver which is using the NUMA API?
>>
>> please provide some inputs...
>
> The kernel is NUMA-aware (at least for x86),
>
Which "x86"? i386? amd64? Both?

> and memory is allocated
> round-robin amongst the memory domains.  There are not yet any KPIs
> for allocating memory in a specific NUMA domain, nor for binding
> specific threads / processes to get their memory local to a bound cpu
> instead of round robin.
>
I'm not sure I follow you. Say you have 2 CPU packages, each providing
a memory domain, 4 physical cores and possibly 8 virtual ones. Say you
have a network adapter supporting 8 RX/TX queues, dispatching RX
packets to 8 netisr threads. Ideally, you'd want each of those 8
queue/netisr pairs to have an affinity for a given CPU/memory domain,
and have the network adapter spread flows evenly across those 8 CPUs.
Now, if an mbuf is allocated from memory domain 1 and ends up being
processed by a CPU in domain 0, that is likely to introduce a
performance penalty.
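
The scenario above can be sketched as a small Python model. Everything in it is an assumption for illustration (CPUs 0-3 in domain 0, CPUs 4-7 in domain 1, queue N serviced on CPU N); it only counts how many queues would be touching mbufs from the "wrong" domain, it does not measure any real penalty.

```python
CPUS_PER_DOMAIN = 4  # assumption: CPUs 0-3 sit in domain 0, CPUs 4-7 in domain 1


def domain_of_cpu(cpu):
    """Memory domain that a CPU belongs to in this toy topology."""
    return cpu // CPUS_PER_DOMAIN


def remote_queues(queue_to_cpu, mbuf_domain_for_queue):
    """Count queues whose mbufs live in a different domain than the servicing CPU."""
    return sum(
        1
        for queue, cpu in queue_to_cpu.items()
        if mbuf_domain_for_queue[queue] != domain_of_cpu(cpu)
    )


# Assumption: queue N is handled by the netisr thread pinned to CPU N.
queue_to_cpu = {q: q for q in range(8)}

# mbufs taken from the domain of the servicing CPU vs. handed out round-robin.
local_mbufs = {q: domain_of_cpu(cpu) for q, cpu in queue_to_cpu.items()}
round_robin_mbufs = {q: q % 2 for q in range(8)}

print(remote_queues(queue_to_cpu, local_mbufs))        # 0
print(remote_queues(queue_to_cpu, round_robin_mbufs))  # 4
```

With domain-local allocation no queue crosses a domain boundary; with round-robin placement half of them do in this layout.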

Now, what about userland?

This is certainly a horribly big picture :/

 - Arnaud

> There have been several discussions but no one has taken the lead and
> proposed some KPIs yet.  At $WORK the round-robin is sufficient to get
> consistent performance numbers and we have not yet started any
> experimentation with binding specific threads to either CPU or memory.
>
> Cheers,
> matthew
> _______________________________________________
> [hidden email] mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
> To unsubscribe, send any mail to "[hidden email]"
>

Re: NUMA Support is there in FreeBSD.

mdf-2
On Mon, Oct 3, 2011 at 10:24 AM, Arnaud Lacombe <[hidden email]> wrote:

> Hi,
>
> On Mon, Oct 3, 2011 at 12:31 PM,  <[hidden email]> wrote:
>> On Mon, Oct 3, 2011 at 7:55 AM, satish kondapalli <[hidden email]> wrote:
>>> I am new to FreeBSD, I just want know whether FreeBSD supports NUMA.
>>> If FreeBSD supports NUMA what are the kernel API to allocate memory?
>>> is there any example driver or any driver which is using the NUMA API?
>>>
>>> please provide some inputs...
>>
>> The kernel is NUMA-aware (at least for x86),
>>
> What "x86" ? i386 ? amd64 ? both ?

Both; see sys/x86/acpica/srat.c which parses the SRAT table.
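
For readers who have not looked at the SRAT: it is the ACPI table that tells the OS which physical memory ranges belong to which proximity (NUMA) domain. The Python sketch below is not a translation of srat.c. It assumes the 40-byte "Memory Affinity" entry layout as I understand it from ACPI 3.0 and later, and it walks only a packed run of sub-table entries (the table header is not handled). The helper names are made up for the example.

```python
import struct

# Assumed layout of one SRAT Memory Affinity entry (little-endian, 40 bytes):
# type, length, proximity domain, reserved, base low/high, length low/high,
# reserved, flags, then 8 reserved bytes.
MEM_AFFINITY = struct.Struct("<BBIHIIIIII8x")
SRAT_TYPE_MEMORY = 1
FLAG_ENABLED = 0x1


def memory_ranges(entries):
    """Return (domain, base, length) for each enabled memory affinity entry."""
    ranges = []
    offset = 0
    while offset + 2 <= len(entries):
        entry_type, entry_len = entries[offset], entries[offset + 1]
        if entry_len < 2 or offset + entry_len > len(entries):
            break  # malformed or truncated entry: stop instead of looping forever
        if entry_type == SRAT_TYPE_MEMORY and entry_len == MEM_AFFINITY.size:
            (_, _, domain, _, base_lo, base_hi,
             len_lo, len_hi, _, flags) = MEM_AFFINITY.unpack_from(entries, offset)
            if flags & FLAG_ENABLED:
                base = (base_hi << 32) | base_lo
                length = (len_hi << 32) | len_lo
                ranges.append((domain, base, length))
        offset += entry_len  # other entry types are skipped by their length
    return ranges


# Two made-up 2 GB ranges, one per domain.
sample = (
    MEM_AFFINITY.pack(SRAT_TYPE_MEMORY, 40, 0, 0, 0x00000000, 0, 0x80000000, 0, 0, FLAG_ENABLED)
    + MEM_AFFINITY.pack(SRAT_TYPE_MEMORY, 40, 1, 0, 0x80000000, 0, 0x80000000, 0, 0, FLAG_ENABLED)
)
print(memory_ranges(sample))  # [(0, 0, 2147483648), (1, 2147483648, 2147483648)]
```
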

>> and memory is allocated
>> round-robin amongst the memory domains.  There are not yet any KPIs
>> for allocating memory in a specific NUMA domain, nor for binding
>> specific threads / processes to get their memory local to a bound cpu
>> instead of round robin.
>>
> I'm not sure to follow you. Say you have 2 memory domain attached to 2
> different CPU package, each providing a memory domain, 4 physical core
> and eventually 8 virtual. Say you have a network adapter supporting 8
> RX/TX queue, dispatching RX packet to 8 netisr. Ideally, you'd want
> those 8 queue/netisr to each have an affinity for a given CPU/memory
> domain, have the network adapter route flow evenly on those those 8
> CPU. Now, if you allocated an mbuf from memory domain 1, and end up
> being processed by a CPU in domain 0, that likely to introduce
> performance penalty.

Your statement isn't incorrect.  What I'm saying is that there's no
KPI for requesting bound memory because, while the netisr example is
a fine one for where local memory is desired, the majority [1] of
processing is not bound to a CPU, and so round-robin allocations will
produce uniform performance results -- that is, not the best possible,
but not wildly fluctuating as scheduling decisions over different runs
incur different remote-memory penalties.

[1] for some definition of 'majority'.
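
The "uniform, but not the best possible" argument can be checked with a little arithmetic. The Python sketch below is a toy model under stated assumptions (two domains, 1000 pages, an unbound thread that the scheduler may run in either domain); the names are invented for the example and nothing here is measured on real hardware.

```python
def remote_fraction(page_domains, cpu_domain):
    """Fraction of pages that live outside the domain of the CPU touching them."""
    remote = sum(1 for d in page_domains if d != cpu_domain)
    return remote / len(page_domains)


NUM_DOMAINS = 2
NUM_PAGES = 1000

round_robin_pages = [i % NUM_DOMAINS for i in range(NUM_PAGES)]
single_domain_pages = [0] * NUM_PAGES  # everything allocated from domain 0

# One entry per domain the unbound thread might end up running in.
round_robin_runs = [remote_fraction(round_robin_pages, d) for d in range(NUM_DOMAINS)]
single_domain_runs = [remote_fraction(single_domain_pages, d) for d in range(NUM_DOMAINS)]

print(round_robin_runs)    # [0.5, 0.5]
print(single_domain_runs)  # [0.0, 1.0]
```

Round-robin gives the same remote fraction wherever the thread runs, which is the consistent result described above. Putting all pages in one domain gives either the best or the worst case depending on a scheduling decision the program does not control.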

> Now, what about userland ?
>
> This is certainly an horribly big picture :/

Yes, and that's why I said only that there's no KPI.  One reason there
is no KPI is that there are a lot of fiddly bits to take into account.

My experience at IBM on AIX was that NUMA is very easy to get wrong.
What one usually wants is for the OS to get the answer right
(especially for userspace) without a lot of manual tuning.  Except for
some specific applications, like netisr queues, a machine doing HPC, or
one mostly running e.g. an Oracle db server, there's too much
happening for any one program to configure itself "right" for all the
uses of that code.  I remember a lot of customer reports of problems
from overly aggressive local memory use.  Most of the time no one
complained when things had consistent performance, even if that wasn't
quite as fast as possible.

In fact, I may be wrong about the round-robin; I sent jhb@ a patch and
I have no recollection anymore whether it's actually in CURRENT.  It's
been over a year since I thought about this much (BSDCan 2010 was the
last time I remember).

Cheers,
matthew

Re: NUMA Support is there in FreeBSD.

Arnaud Lacombe-6
Hi,

[Add jhb@ to the CC list]

On Mon, Oct 3, 2011 at 1:34 PM,  <[hidden email]> wrote:

> On Mon, Oct 3, 2011 at 10:24 AM, Arnaud Lacombe <[hidden email]> wrote:
>> Hi,
>>
>> On Mon, Oct 3, 2011 at 12:31 PM,  <[hidden email]> wrote:
>>> On Mon, Oct 3, 2011 at 7:55 AM, satish kondapalli <[hidden email]> wrote:
>>>> I am new to FreeBSD, I just want know whether FreeBSD supports NUMA.
>>>> If FreeBSD supports NUMA what are the kernel API to allocate memory?
>>>> is there any example driver or any driver which is using the NUMA API?
>>>>
>>>> please provide some inputs...
>>>
>>> The kernel is NUMA-aware (at least for x86),
>>>
>> What "x86" ? i386 ? amd64 ? both ?
>
> Both; see sys/x86/acpica/srat.c which parses the SRAT table.
>
>>> and memory is allocated
>>> round-robin amongst the memory domains.  There are not yet any KPIs
>>> for allocating memory in a specific NUMA domain, nor for binding
>>> specific threads / processes to get their memory local to a bound cpu
>>> instead of round robin.
>>>
>> I'm not sure to follow you. Say you have 2 memory domain attached to 2
>> different CPU package, each providing a memory domain, 4 physical core
>> and eventually 8 virtual. Say you have a network adapter supporting 8
>> RX/TX queue, dispatching RX packet to 8 netisr. Ideally, you'd want
>> those 8 queue/netisr to each have an affinity for a given CPU/memory
>> domain, have the network adapter route flow evenly on those those 8
>> CPU. Now, if you allocated an mbuf from memory domain 1, and end up
>> being processed by a CPU in domain 0, that likely to introduce
>> performance penalty.
>
> Your statement isn't incorrect.  What I'm saying is that there's no
> KPI for requesting bound memory because, while the netstat example is
> a fine one for where local memory is desired, the majority [1] of
> processing is not bound to a CPU and so round-robin allocations will
> produce uniform performance results -- that is, not the best possible,
> but not wildly fluctuating as scheduling decisions over different runs
> give different remote memory penalties.
>
> [1] for some definition of 'majority'.
>
>> Now, what about userland ?
>>
>> This is certainly an horribly big picture :/
>
> Yes, and it's why I said just that there's no KPI.  One reason there
> is no KPI is that there's a lot of fiddly bits to take into account.
>
> My experience at IBM on AIX was that NUMA is very easy to get wrong;
> specifically what one usually wants is for the OS to get the answer
> right (especially for userspace) without a lot of manual tuning;
> except for some specific applications like netstat queues or a machine
> doing HPC or mostly running e.g. an Oracle db server, there's too much
> happening for any one program to configure itself "right" for all the
> uses of that code.  I remember a lot of customer reports of problems
> from overly aggressive local memory use.  Most of the time no one
> complained when things had consistent performance, even if that wasn't
> quite as fast as possible.
>
Is there any project in progress to get this addressed? In the past
year, I can only see 3 commits related to NUMA, one of them
concerning only ia64.

Btw, I'd be interested to see how FreeBSD 9.0 and a recent Linux
kernel behave on machines with 2+ CPU packages.

 - Arnaud

[0]: http://lwn.net/Articles/254445/

> In fact, I may be wrong about the round-robin; I sent jhb@ a patch and
> I have no recollection anymore whether it's actually in CURRENT.  It's
> been over a year since I thought about this much (BSDCan 2010 was the
> last time I remember).
>
> Cheers,
> matthew
>

Re: NUMA Support is there in FreeBSD.

Lev Serebryakov
In reply to this post by mdf-2
Hello, Mdf.
You wrote on 3 October 2011, 21:34:29:

> Your statement isn't incorrect.  What I'm saying is that there's no
> KPI for requesting bound memory because, while the netstat example is
> a fine one for where local memory is desired, the majority [1] of
> processing is not bound to a CPU and so round-robin allocations will
> produce uniform performance results -- that is, not the best possible,
> but not wildly fluctuating as scheduling decisions over different runs
> give different remote memory penalties.
    We have exactly the same configuration at ${WORK} as Arnaud describes,
  and we need to process heavy UDP traffic (4+ Gbit/s at wire speed, in
  small packets of 100-1000 bytes). Without fixed affinity of the "netisr"
  threads, our system drops some packets on the way between the DMA-mapped
  network card buffers and the kernel structures. One big difference: we
  use Solaris, and it has all the needed APIs, KPIs and userland control
  utilities to tune the system, both kernel-side and userland-side. Even
  Solaris, though, could not process such traffic "automagically". We
  didn't try FreeBSD, as our ops team knows nothing about it (I'm the only
  FreeBSD fan on the team, and I'm a developer, not operations)...

    I wrote this as an example that, for some tasks, the system NEEDS all
  these NUMA-specific knobs.

    BTW, the NUMA-aware allocator in HotSpot (Sun's, errr, sorry, Oracle's
  Java VM), added between Java 6 and Java 7, increased performance up to
  300% for some workloads on a 72-way system (Sun Fire 15K), and costs
  about a 3% performance drop in the worst cases :) And that was an
  allocator in a virtual machine! But it would have been impossible
  without the kernel API, so these changes work well only on
  HotSpot/Solaris ;-)

    Again, I wrote all this to show that NUMA-awareness can be very
  useful on big iron.

--
// Black Lion AKA Lev Serebryakov <[hidden email]>
