Strategy for PCI resource management (for supporting hot-plug)


Strategy for PCI resource management (for supporting hot-plug)

Rajat Jain

Hi,

I'm trying to add PCI-E hotplug support to FreeBSD. As a first step, I'm
trying to decide on a resource management / allocation strategy for the
PCI memory / IO space and the bus numbers. Can you please comment on the
following approach that I am considering for resource allocation:

PROBLEM STATEMENT:
------------------
Given a memory range [A->B], an IO range [C->D], and a limited number of
bus numbers (256), enumerate the PCI tree of a system, leaving enough
"holes" in between to allow the addition of future devices.

PROPOSED STRATEGY:
------------------
1) When booting, start enumerating in depth-first-search order. During
enumeration, always keep track of:

 * The next bus number (x) that can be allocated.

 * The next memory-space pointer (A + y) from which allocation can be
   done ("y" is the memory already allocated).

 * The next IO-space pointer (C + z) from which allocation can be done
   ("z" is the IO space already allocated).

Keep incrementing the above as resources are allocated.

2) Allocate bus numbers sequentially while traversing down from the root
to a leaf node (endpoint). When going down through a bridge:

 * Allocate the next available bus number (x) to the secondary bus of
   the bridge.

 * Temporarily set the subordinate bus number to 0xFF (to allow discovery
   of the maximum number of buses).

 * Temporarily assign all the remaining available memory space,
   [(A+y) -> B], to the bridge. Ditto for IO space.

3) When a leaf node (endpoint) is reached, allocate the memory / IO
resources requested by the device, and increment the pointers.

4) While passing a bridge in the upward direction, tweak the bridge
registers so that its resources are ONLY ENOUGH to address the needs of
the entire PCI tree below it, plus some memory for its own internal
memory-mapped registers, if it has any.
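
To make the flow concrete, here is a rough sketch of the walk I have in
mind, in C. The types and helpers below (alloc_cursor, pci_node,
enumerate) are made up purely for illustration and are not existing
FreeBSD interfaces; real code would also have to honor the bridge window
granularity (1 MB for memory, 4 KB for IO) when shrinking the windows.

#include <stdint.h>

struct alloc_cursor {
        uint8_t  next_bus;      /* x: the next free bus number           */
        uint64_t next_mem;      /* A + y: next free memory-space address */
        uint64_t next_io;       /* C + z: next free IO-space address     */
};

struct pci_node {
        int             is_bridge;
        uint64_t        mem_req;        /* memory BAR size of an endpoint */
        uint64_t        io_req;         /* IO BAR size of an endpoint     */
        struct pci_node *children;      /* first device behind the bridge */
        struct pci_node *sibling;       /* next device on the same bus    */
        /* What ends up programmed into the device / bridge: */
        uint8_t         sec_bus, sub_bus;
        uint64_t        mem_base, mem_limit, io_base, io_limit;
};

/* Round v up to align (BAR sizes are powers of two; align == 0: no BAR). */
static uint64_t
roundup_align(uint64_t v, uint64_t align)
{
        return (align ? (v + align - 1) & ~(align - 1) : v);
}

static void
enumerate(struct pci_node *dev, struct alloc_cursor *cur)
{
        struct pci_node *c;

        if (!dev->is_bridge) {
                /* Step 3: endpoint -- satisfy its BARs, advance cursors. */
                cur->next_mem = roundup_align(cur->next_mem, dev->mem_req);
                dev->mem_base = cur->next_mem;
                cur->next_mem += dev->mem_req;
                cur->next_io = roundup_align(cur->next_io, dev->io_req);
                dev->io_base = cur->next_io;
                cur->next_io += dev->io_req;
                return;
        }

        /* Step 2: going down through a bridge. */
        dev->sec_bus = cur->next_bus++;         /* next free bus number     */
        dev->sub_bus = 0xFF;                    /* temporarily "all buses"  */
        dev->mem_base = cur->next_mem;          /* temporarily the whole    */
        dev->io_base = cur->next_io;            /* remaining space          */

        for (c = dev->children; c != NULL; c = c->sibling)
                enumerate(c, cur);

        /* Step 4: coming back up -- shrink the window to what was used. */
        dev->sub_bus = cur->next_bus - 1;
        dev->mem_limit = cur->next_mem;
        dev->io_limit = cur->next_io;
}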

The above is the standard depth-first algorithm for resource allocation.
Here is the addition to support hot-plug:

At each bridge that supports hot-plug, in addition to the resources that
would normally have been allocated to this bridge, pre-allocate and
assign to the bridge (in anticipation of any new devices that may be
added later):

a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges, PCI trees
   present on the device plugged.

b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
that
   may be attached later on.

c) "RESRVE_IO" amount of IO space, to cater to all PCI devices that may
be
   attached later on.

Please note that the above RSRVE_* values are constants defining the
amount of resources to be set aside for / below each HOT-PLUGGABLE
bridge; they may be tweaked via a compile-time option or via a sysctl.
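
As a rough illustration of that padding step (this reuses the made-up
types from the sketch above; the RSRVE_* values shown are only example
defaults for the tunables just mentioned):

/*
 * Tunable reserves in the spirit of the proposal; the names mirror the
 * RSRVE_* constants above and the values are examples only.
 */
#define RSRVE_NUM_BUS   4               /* spare bus numbers per bridge */
#define RSRVE_MEM       (32ULL << 20)   /* spare memory space (32 MB)   */
#define RSRVE_IO        (4ULL << 10)    /* spare IO space (4 KB)        */

/*
 * Called on the way back up through a bridge that supports hot-plug,
 * right after the normal "shrink to fit" step: grow the window by the
 * reserve before the parent bridge accounts for it.
 */
static void
pad_hotplug_bridge(struct pci_node *br, struct alloc_cursor *cur)
{
        cur->next_bus += RSRVE_NUM_BUS;         /* buses for future bridges */
        br->sub_bus = cur->next_bus - 1;

        cur->next_mem += RSRVE_MEM;             /* decode space for future  */
        br->mem_limit = cur->next_mem;          /* endpoints                */

        cur->next_io += RSRVE_IO;
        br->io_limit = cur->next_io;
}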

FEW COMMENTS
------------
 
1) The strategy is fairly generic and tweakable, and it need not waste
a lot of resources (the developer needs to pick a sensible value for
how much to reserve at each hot-pluggable slot):

   * Reservations are done only for hot-pluggable bridges.

   * The developer can tweak (or even disable) the amount of resources
     reserved for each hot-pluggable bridge.
   
2) One point of debate is what happens if there are too many resource
demands in the system (too many devices, or the developer configures too
many resources to be reserved for each hot-pluggable device). For example,
consider that during enumeration we find that all the resources are
already allocated while there are more devices that need resources. Do we
simply not enumerate them? Etc.

Overall, how does the above look?

Thanks & Best Regards,

Rajat Jain

Re: Strategy for PCI resource management (for supporting hot-plug)

Warner Losh
From: Rajat Jain <[hidden email]>
Subject: Strategy for PCI resource management (for supporting hot-plug)
Date: Tue, 23 Feb 2010 12:46:40 +0530

>
> Hi,
>
> I'm trying to add PCI-E hotplug support to the FreeBSD. As a first step
> for the PCI-E hotplug support, I'm trying to decide on a resource
> management / allocation strategy for the PCI memory / IO and the bus
> numbers. Can you please comment on the following approach that I am
> considering for resource allocation:
>
> PROBLEM STATEMENT:
> ------------------
> Given a memory range [A->B], IO range [C->D], and limited (256) bus
> numbers, enumerate the PCI tree of a system, leaving enough "holes" in
> between to allow addition of future devices.
>
> PROPOSED STRATEGY:
> ------------------
> 1) When booting, start enumerating in depth-first-search order. During
> enumeration, always keep track of:
>
>  * The next bus number (x) that can be allocated.
>
>  * The next memory-space pointer (A + y) from which allocation can be
>    done ("y" is the memory already allocated).
>
>  * The next IO-space pointer (C + z) from which allocation can be done
>    ("z" is the IO space already allocated).
>
> Keep incrementing the above as resources are allocated.

IO space and memory space are bus addresses, which may have a mapping
to another domain.
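
In other words, each host bridge / PCI domain may apply a translation
along these lines (illustrative only, not an existing structure):

#include <stdint.h>

/* A window decoded by a host bridge for one PCI domain. */
struct pci_domain_window {
        uint64_t bus_base;      /* base address as seen by PCI devices */
        uint64_t cpu_base;      /* base address as seen by the CPU     */
        uint64_t size;
};

/* For a bus address inside the window:
 *   cpu_addr = bus_addr - bus_base + cpu_base
 * so the allocator has to track bus addresses, not CPU addresses. */
static uint64_t
bus_to_cpu(const struct pci_domain_window *w, uint64_t bus_addr)
{
        return (bus_addr - w->bus_base + w->cpu_base);
}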

> 2) Allocate bus numbers sequentially while traversing down from the root
> to a leaf node (endpoint). When going down through a bridge:
>
>  * Allocate the next available bus number (x) to the secondary bus of
>    the bridge.
>
>  * Temporarily set the subordinate bus number to 0xFF (to allow
>    discovery of the maximum number of buses).
>
>  * Temporarily assign all the remaining available memory space,
>    [(A+y) -> B], to the bridge. Ditto for IO space.

I'm not sure this is wise.

> 3) When a leaf node (End point) is reached, allocate the memory / IO
> resource requested by the device, and increment the pointers.

Keep in mind that devices may not have drivers attached to them at
bus enumeration time.  With hot-plug devices, you might not even
know all the devices that are there or could be there.

> 4) While passing a bridge in the upward direction, tweak the bridge
> registers so that its resources are ONLY ENOUGH to address the needs of
> the entire PCI tree below it, plus some memory for its own internal
> memory-mapped registers, if it has any.

How does one deal with adding a device that has a bridge on it?  I
think that the "only enough" part is likely going to lead to problems,
as you'll need to move other resources if a new device arrives here.

> The above is the standard depth-first algorithm for resource allocation.
> Here is the addition to support hot-plug:

The above won't quite work for CardBus :)  But that's a hot-plug
device...

> At each bridge that supports hot-plug, in addition to the resources
> that would normally have been allocated to this bridge, pre-allocate
> and assign to the bridge (in anticipation of any new devices that may
> be added later):

In addition, or in total?  If it were the total, you could more easily
allocate memory or IO space ranges in a more deterministic way when you
have to deal with booting with or without a given device present.

> a) "RSRVE_NUM_BUS" number of busses, to cater to any bridges, PCI trees
>    present on the device plugged.

This one might make sense, but if we have multiple levels then you'll
run out.  If you allocate X additional buses at the root and 4
additional bridges show up below it, then you can only reserve
(X - 4) / 4 additional buses at each of those bridges.
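
To put numbers on it (purely illustrative):

#include <stdio.h>

int
main(void)
{
        int spare_at_root = 12; /* X: spare bus numbers reserved at the root */
        int bridges_below = 4;  /* bridges that show up one level down       */

        /* Each new bridge consumes one bus number for its own secondary
         * bus, leaving only (X - 4) spare numbers to split between them. */
        int spare_per_bridge = (spare_at_root - bridges_below) / bridges_below;

        printf("%d spare buses per second-level bridge\n", spare_per_bridge);
        return (0);
}

With X = 12 that's only 2 spare buses under each of those bridges, and
another level of bridges eats into that again.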

> b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
> that
>    may be attached later on.
>
> c) "RESRVE_IO" amount of IO space, to cater to all PCI devices that may
> be
>    attached later on.

Similar comments apply here.

> Please note that the above RSRVE* are constants defining the amount of
> resources to be set aside for /below each HOT-PLUGGABLE bridge; their
> values may be tweaked via a compile time option or via a sysctl.
>
> FEW COMMENTS
> ------------
>  
> 1) The strategy is fairly generic and tweakable, and it need not waste
> a lot of resources (the developer needs to pick a sensible value for
> how much to reserve at each hot-pluggable slot):
>
>    * Reservations are done only for hot-pluggable bridges.
>
>    * The developer can tweak (or even disable) the amount of resources
>      reserved for each hot-pluggable bridge.

I'd like to understand the details of this better, especially when
you have multiple layers where devices that have bridges are
hot-plugged into the system.

For example, there's a CardBus-to-PCI bridge which has 3 PCI slots
behind it.  These slots may have, say, a quad Ethernet card which has
a PCI bridge to allow the 4 PCI NICs behind it.  Now, while this
example may be dated, newer PCIe also allows for it...

> 2) One point of debate is what happens if there are too many resource
> demands in the system (too many devices, or the developer configures
> too many resources to be reserved for each hot-pluggable device). For
> example, consider that during enumeration we find that all the
> resources are already allocated while there are more devices that need
> resources. Do we simply not enumerate them? Etc.

How is this different from normal resource failure?  And how will you
know at initial enumeration what devices will be plugged in?

> Overall, how does the above look?

In general, it looks fairly good.  I'm just worried about the multiple
layer case :)

Warner

> Thanks & Best Regards,
>
> Rajat Jain

Re: Strategy for PCI resource management (for supporting hot-plug)

Attilio Rao
2010/2/23 Rajat Jain <[hidden email]>:
>
> Hi,
>
> I'm trying to add PCI-E hotplug support to the FreeBSD. As a first step
> for the PCI-E hotplug support, I'm trying to decide on a resource
> management / allocation strategy for the PCI memory / IO and the bus
> numbers. Can you please comment on the following approach that I am
> considering for resource allocation:

You may also want to coordinate with jhb@, who is working on a multipass
layer for improving resource mapping/allocation.

Thanks,
Attilio


--
Peace can only be achieved by understanding - A. Einstein

Re: Strategy for PCI resource management (for supporting hot-plug)

John Baldwin
On Tuesday 23 February 2010 2:16:40 am Rajat Jain wrote:

> Overall, how does the above look?

I think one wrinkle is that we should try to preserve the resources that the
firmware has set for devices, at least on x86.  I had also wanted to make use
of multipass for this, but that requires a bit more work to split the PCI
bus attach up into separate steps.

--
John Baldwin

Re: Re: Strategy for PCI resource management (for supporting hot-plug)

Sergey Babkin
(Sorry if this email comes out looking weird; I want to give another try
to see whether the provider has fixed the formatting issues in the web
interface or not.)

On Tuesday 23 February 2010 2:16:40 am Rajat Jain wrote:

> At each bridge that supports hot-plug, in addition to the resources
> that would normally have been allocated to this bridge, pre-allocate
> and assign to the bridge (in anticipation of any new devices that may
> be added later):
>
> a) "RSRVE_NUM_BUS" bus numbers, to cater to any bridges or PCI trees
>    present on the device that gets plugged in.
>
> b) "RSRVE_MEM" amount of memory space, to cater to all the PCI devices
>    that may be attached later on.
>
> c) "RSRVE_IO" amount of IO space, to cater to all the PCI devices that
>    may be attached later on.

A kind of stupid question: should the reserve amounts depend on the level
of the bridge?  Perhaps the bridges closer to the root should get larger
reserves.  Perhaps it doesn't matter so much during the initial
enumeration, but it may matter later, after a hot plug.

Suppose we have a bridge B1 that gets RSRVE resources set aside for it
during the initial enumeration.  Then someone comes and hot-plugs a bridge
B2 under B1.  B2, I guess, will then also try to get a reserve of RSRVE
resources for itself, so it would take the whole original reserve of B1
for itself.  If someone later comes and tries to hot-plug another bridge
B3 under B1, that bridge would not get any resources and the plugging
would fail.
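
Just to illustrate the idea, something along these lines (the function
and the halving policy are made up, not part of the proposal):

#include <stdint.h>

/* Example policy: 32 MB of spare memory space for a hot-plug bridge
 * directly below the root, halved for each extra bridge level, so a
 * hot-plugged B2 asks for less than its parent B1 set aside and a
 * later B3 can still fit under B1. */
static uint64_t
mem_reserve_for_depth(int depth)
{
        uint64_t base = 32ULL << 20;                /* top-level reserve */

        return (base >> (depth < 6 ? depth : 6));   /* clamp, never 0    */
}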

-SB