NUMA?

NUMA?

Ivan Voras
Hi,

As even Intel's new CPUs have integrated memory controllers and thus
become NUMA, I'm interested in what is, in theory (I'm not proposing to
do it, I'm just curious), necessary to change in an OS to support NUMA.
My guess is:

1) node topology detection - something similar to what ULE does but also
recording which memory ranges are "close" to which CPU and the
"distance" between nodes/CPUs
2) on new image load (exec), pick a node for it, among "least used"
nodes and record the choice per-proc; on fork, keep the new process on
the same node
3) schedule threads on a CPU from the proc's node if at all possible
(e.g., when a 6-core CPU is still 1 node), then on a "near" node from a
list of distances sorted in order of cost
4) allocate new pages for a proc from its node's memory range(s) if at
all possible.
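
The four steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the names `Node`, `Topology`, `pick_home_node`, `cpu_preference` and `alloc_page` are hypothetical and do not correspond to any FreeBSD interface, the "use" metric is simply a count of processes homed on a node, and the distance table holds relative costs where a smaller number means closer.

```python
# Hypothetical sketch of steps 1-4; none of these names exist in FreeBSD.
from dataclasses import dataclass


@dataclass
class Node:
    node_id: int
    cpus: list          # CPU ids that belong to this node
    free_pages: int     # pages still free in this node's memory range(s)
    nprocs: int = 0     # processes homed here (assumed "use" metric)


class Topology:
    def __init__(self, nodes, distance):
        # Step 1: the detected topology. distance[a][b] is the relative
        # cost for node a to reach node b's memory (smaller is closer).
        self.nodes = {n.node_id: n for n in nodes}
        self.distance = distance

    def pick_home_node(self):
        """Step 2 (exec): choose the least-used node, lowest id on ties."""
        node = min(self.nodes.values(), key=lambda n: (n.nprocs, n.node_id))
        node.nprocs += 1
        return node.node_id

    def fork(self, parent_home):
        """Step 2 (fork): the child stays on the parent's node."""
        self.nodes[parent_home].nprocs += 1
        return parent_home

    def fallback_order(self, home):
        """Node ids sorted by distance from home, home node first."""
        return sorted(self.nodes, key=lambda n: (self.distance[home][n], n))

    def cpu_preference(self, home):
        """Step 3: CPUs to try, the home node's CPUs first, then nearer nodes."""
        return [c for n in self.fallback_order(home) for c in self.nodes[n].cpus]

    def alloc_page(self, home):
        """Step 4: take a page from the nearest node that still has one."""
        for node_id in self.fallback_order(home):
            node = self.nodes[node_id]
            if node.free_pages > 0:
                node.free_pages -= 1
                return node_id
        raise MemoryError("no free pages on any node")


# Three nodes in a line: 0 <-> 1 <-> 2, two CPUs each.
topo = Topology(
    [Node(0, [0, 1], free_pages=1),
     Node(1, [2, 3], free_pages=2),
     Node(2, [4, 5], free_pages=2)],
    distance=[[10, 20, 30], [20, 10, 20], [30, 20, 10]],
)
home = topo.pick_home_node()
print(home, topo.cpu_preference(home), topo.alloc_page(home))
```

A real scheduler would also have to weigh load against distance, which this sketch does not attempt.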

Is this all?

On the other hand, did someone do a study of performance increase for
today's "consumer" NUMA systems (e.g. 2-4 sockets/nodes x86/x64 systems)
- is it worth it?

_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-smp
To unsubscribe, send any mail to "[hidden email]"

Re: NUMA?

Julian Elischer
Ivan Voras wrote:
> Hi,

I did the AMD course a few weeks ago so I'm also very interested in this..

>
> As even Intel's new CPUs have integrated memory controllers and thus
> become NUMA, I'm interested in what is, in theory (I'm not proposing to
> do it, I'm just curious), necessary to change in an OS to support NUMA.
> My guess is:
>
> 1) node topology detection - something similar to what ULE does but also
> recording which memory ranges are "close" to which CPU and the
> "distance" between nodes/CPUs

at a minimum, this is needed before anything else can really work.

> 2) on new image load (exec), pick a node for it, among "least used"
> nodes and record the choice per-proc; on fork, keep the new process on
> the same node

In some cases it may be worth having multiple copies of the read-only
text segments.
For example, it may eventually be worth having a /bin/sh text segment
in each CPU's memory space.
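
As a rough illustration of the replicated-text idea, the sketch below keeps at most one copy of a read-only text segment per node and hands each thread the copy on its own node. Everything here is hypothetical (`ReplicatedText`, `pages_for`, the `replicate` flag); it assumes the text really is immutable, which is what makes the copies safe.

```python
class ReplicatedText:
    """Read-only text pages with at most one copy per node (hypothetical)."""

    def __init__(self, pages, origin_node):
        # The node that first loaded the image holds the original copy.
        self.origin = origin_node
        self.copies = {origin_node: tuple(pages)}

    def pages_for(self, node_id, replicate=True):
        """Return (serving node, pages) for a thread running on node_id."""
        if node_id not in self.copies and replicate:
            # Safe only because text is read-only: copies can never diverge.
            self.copies[node_id] = tuple(self.copies[self.origin])
        serving = node_id if node_id in self.copies else self.origin
        return serving, self.copies[serving]


# /bin/sh text first loaded on node 0, later executed on node 1.
sh_text = ReplicatedText([b"page0", b"page1"], origin_node=0)
remote_node, _ = sh_text.pages_for(1, replicate=False)  # served from node 0
local_node, _ = sh_text.pages_for(1)                    # node 1 gets a copy
print(remote_node, local_node)
```

The trade-off is memory for locality: each replica costs one extra copy of the text per node.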


> 3) schedule threads on a CPU from the proc's node if at all possible
> (e.g., when a 6-core CPU is still 1 node), then on a "near" node from a
> list of distances sorted in order of cost

this is where it really starts getting hairy.. when do you migrate a
process? and what if there are as many threads runnable as processors?

> 4) allocate new pages for a proc from its node's memory range(s) if at
> all possible.
>
> Is this all?

There are other interesting effects too..

assigning network interrupts to processors that have good access to
the hardware AND the destination if you can..
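
One way to picture this: give each CPU a cost equal to the distance from its node to the device plus the distance from its node to the memory where the data will end up, and route the interrupt to the cheapest CPU. The sketch below is a hypothetical illustration (the function `pick_irq_cpu` and its arguments are assumptions, not an existing interface), and it ignores interrupt load entirely.

```python
def pick_irq_cpu(cpu_node, distance, device_node, dest_node):
    """Return the CPU with the lowest combined cost (lowest CPU id on ties).

    cpu_node:    dict mapping CPU id -> node id
    distance:    distance[a][b] = relative cost for node a to reach node b
    device_node: node the NIC is attached to
    dest_node:   node holding the memory the data is destined for
    """
    def cost(cpu):
        node = cpu_node[cpu]
        return distance[node][device_node] + distance[node][dest_node]

    # min() keeps the first minimum, so sorting the ids breaks ties low.
    return min(sorted(cpu_node), key=cost)


# Three nodes in a line (0 <-> 1 <-> 2), two CPUs per node.
cpu_node = {0: 0, 1: 0, 2: 1, 3: 1, 4: 2, 5: 2}
distance = [[10, 20, 30], [20, 10, 20], [30, 20, 10]]
print(pick_irq_cpu(cpu_node, distance, device_node=2, dest_node=2))
```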


>
> On the other hand, did someone do a study of performance increase for
> today's "consumer" NUMA systems (e.g. 2-4 sockets/nodes x86/x64 systems)
> - is it worth it?

caches hide a multitude of sins..


Re: NUMA?

Marc Wiz
In reply to this post by Ivan Voras
On Thu, Nov 13, 2008 at 01:35:28AM +0100, Ivan Voras wrote:

> Hi,
>
> As even Intel's new CPUs have integrated memory controllers and thus
> become NUMA, I'm interested in what is, in theory (I'm not proposing to
> do it, I'm just curious), necessary to change in an OS to support NUMA.
> My guess is:
>
> 1) node topology detection - something similar to what ULE does but also
> recording which memory ranges are "close" to which CPU and the
> "distance" between nodes/CPUs
> 2) on new image load (exec), pick a node for it, among "least used"
> nodes and record the choice per-proc; on fork, keep the new process on
> the same node
> 3) schedule threads on a CPU from the proc's node if at all possible
> (e.g., when a 6-core CPU is still 1 node), then on a "near" node from a
> list of distances sorted in order of cost
> 4) allocate new pages for a proc from its node's memory range(s) if at
> all possible.

One good source of information on this topic is IBM's AIX on the
Power 4 - 6 processors.  There is the concept of distant vs. close
memory and processors as well as what is referred to as memory
affinity.

Marc
--
Marc Wiz
[hidden email]
Yes, that really is my last name.

Re: NUMA?

Ivan Voras
In reply to this post by Julian Elischer
Julian Elischer wrote:

> There are other interesting effects too..
>
> assigning network interrupts to processors that have good access to the
> hardware AND the destination if you can..

UMA also seems to be sensitive to topology. While at that, how do you
(if at all) deal with kernel memory allocations with respect to
topology? Things that have their own thread or process are easy, but AFAIK
there is a lot of "thread-agnostic" code?
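
To make the question concrete, here are two policies such an allocator could plausibly offer, sketched under loud assumptions: "first touch" (use the node of whichever CPU happens to run the allocation) and "round robin" (spread allocations that have no natural owner evenly across nodes). The class and method names are hypothetical and this is not a description of what the FreeBSD kernel does.

```python
from itertools import cycle


class KernelAllocator:
    """Toy per-node page pools with two placement policies (hypothetical)."""

    def __init__(self, free_pages):
        self.free_pages = dict(free_pages)      # node id -> free page count
        self._rr = cycle(sorted(self.free_pages))

    def _take(self, node_id):
        if self.free_pages[node_id] == 0:
            return False
        self.free_pages[node_id] -= 1
        return True

    def alloc_first_touch(self, current_node):
        """Prefer the node of the CPU running the allocation, else any node."""
        others = sorted(n for n in self.free_pages if n != current_node)
        for node_id in [current_node] + others:
            if self._take(node_id):
                return node_id
        raise MemoryError("out of pages")

    def alloc_round_robin(self):
        """Spread "thread-agnostic" allocations evenly across nodes."""
        for _ in range(len(self.free_pages)):
            node_id = next(self._rr)
            if self._take(node_id):
                return node_id
        raise MemoryError("out of pages")


ka = KernelAllocator({0: 2, 1: 3})
print([ka.alloc_round_robin() for _ in range(3)], ka.alloc_first_touch(0))
```

First touch favours code that stays on one node; round robin gives up locality in exchange for not overloading any single node's memory.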
