Large virtual page size support.

Large virtual page size support.

Jeff Roberson
I have implemented support in the vm for PAGE_SIZE values which are a
multiple of the hardware page size.  This is primarily useful for two
things:

1) Shrinking the size of the vm page array so that very large memory x86
PAE machines may boot.

2) Improving performance of many operations due to decreased page list
sizes as well as improved efficiency of many vm operations.  In the
particular application that this was developed for, the fs block size, page
size, and jumbo frame size were all made equal at 8k on a box with 4k
pages.  This made page flipping etc. very fast.
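
To put rough numbers on point 1: the page array holds one bookkeeping structure per PAGE_SIZE page, so doubling PAGE_SIZE halves the array. The sketch below is only an illustration of that arithmetic. The 64-byte structure size and the helper names are assumptions made for the example, not values taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed size of one per-page bookkeeping structure.  The real
 * sizeof(struct vm_page) depends on the kernel version and options.
 */
#define ASSUMED_VM_PAGE_STRUCT_SIZE 64u

/* Number of page structures needed to describe ram_bytes of memory. */
static uint64_t
page_count(uint64_t ram_bytes, uint64_t page_size)
{
	/* Page sizes are powers of two. */
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (ram_bytes / page_size);
}

/* Bytes of kernel memory consumed by the page array (hypothetical helper). */
static uint64_t
page_array_bytes(uint64_t ram_bytes, uint64_t page_size)
{
	return (page_count(ram_bytes, page_size) * ASSUMED_VM_PAGE_STRUCT_SIZE);
}
```

Under that assumption, page_array_bytes(32ULL << 30, 4096) comes to 512 MB, while the same call with 8192 comes to 256 MB, which is a meaningful saving in a 32-bit kernel address space.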

This has been done with full userland backwards compatibility.  Userland
still has the ability to map things in native page size chunks.  The
majority of the system software remains unchanged.  The vm gains some
complexity and the elf loader gains some complexity since both need to be
able to deal with native page size and virtual page size.

The real page size is now CPU_PAGE_SIZE while PAGE_SIZE is the virtual
page size which is the smallest unit of memory handed back by the page
allocation routines.  KVA is also managed in PAGE_SIZE chunks.  The x86
pmap code has a small allocator that deals with allocating real pages for
page table entries.
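
The relationship between the two sizes can be sketched as follows. Only the names CPU_PAGE_SIZE and PAGE_SIZE come from the description above; the shift macros, CPU_PAGES_PER_PAGE, and the helper functions are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

#ifdef PAGE_SIZE
#undef PAGE_SIZE		/* in case a system header already defines it */
#endif

#define CPU_PAGE_SHIFT		12			/* 4K hardware page on x86 */
#define CPU_PAGE_SIZE		(1ul << CPU_PAGE_SHIFT)
#define VIRT_PAGE_SHIFT		13			/* hypothetical: 8K virtual page */
#define PAGE_SIZE		(1ul << VIRT_PAGE_SHIFT)
#define CPU_PAGES_PER_PAGE	(PAGE_SIZE / CPU_PAGE_SIZE)	/* hypothetical */

/* Round x up to a multiple of align, which must be a power of two. */
static unsigned long
round_up(unsigned long x, unsigned long align)
{
	return ((x + align - 1) & ~(align - 1));
}

/* The allocator hands memory back in PAGE_SIZE units, never smaller. */
static unsigned long
alloc_size(unsigned long bytes)
{
	return (round_up(bytes, PAGE_SIZE));
}

/* Each virtual page still needs one hardware PTE per CPU page. */
static unsigned long
ptes_for_pages(unsigned long npages)
{
	return (npages * CPU_PAGES_PER_PAGE);
}
```

With these values a one-byte request consumes a full 8K page, and mapping three such pages takes six hardware page table entries.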

I wrote this code for a client who would like it to be in the FreeBSD
tree.  However, it does add some complexity and so I doubt FreeBSD wants
it unless there is a clear demand for it.  What I'd like to know is, does
anyone else find this useful?  Do the developers who work on the vm think
this is just a horrible hack?  Does anyone care about PAE anymore?

Let me know what you think.  The patch is available at
http://www.chesapeake.net/~jroberson/8k.diff.  It will not apply to any
version of FreeBSD that you have.  Please consider it read-only and not
testable until I decide whether it's worth porting.

Cheers,
Jeff
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-arch
To unsubscribe, send any mail to "[hidden email]"

Re: Large virtual page size support.

Poul-Henning Kamp
In message <20060117002541.I602@10.0.0.1>, Jeff Roberson writes:
>I have implemented support in the vm for PAGE_SIZE values which are a
>multiple of the hardware page size.  This is primarily useful for two
>things:

Sounds like a good thing to me.

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
[hidden email]         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Re: Large virtual page size support.

Julian Elischer
In reply to this post by Jeff Roberson
Jeff Roberson wrote:

> I have implemented support in the vm for PAGE_SIZE values which are a
> multiple of the hardware page size.  This is primarily useful for two
> things:


Mach (and the VM system we inherited from it) had this. I believe it was
removed with the comment
"If we need this and someone is willing to support it, it can be added
back".

It always seemed like an interesting idea and I'm happy to see that it
is still being looked at.



Re: Large virtual page size support.

Julian Elischer
Julian Elischer wrote:

> Jeff Roberson wrote:
>
>> I have implemented support in the vm for PAGE_SIZE values which are a
>> multiple of the hardware page size.  This is primarily useful for two
>> things:
>
>
>
> Mach (and the VM system we inherited from it) had this. I believe it
> was removed with the comment
> "If we need this and someone is willing to support it, it can be added
> back".


I can't see any record of it in our CVS files, but I distinctly remember
it in Mach. I'm not sure when it was removed.

>
> It always seemed like an interesting idea and I'm happy to see that it
> is still being looked at.
>

Re: Large virtual page size support.

Poul-Henning Kamp
In reply to this post by Julian Elischer
In message <[hidden email]>, Julian Elischer writes:

>Jeff Roberson wrote:
>
>> I have implemented support in the vm for PAGE_SIZE values which are a
>> multiple of the hardware page size.  This is primarily useful for two
>> things:
>
>Mach (and the VM system we inherited from it) had this. I believe it was
>removed with the comment
>"If we need this and someone is willing to support it, it can be added
>back".

It was a VAX artifact and not very usable.  I believe we have a couple
of comments and macros which still talk about "clicks".

--
Poul-Henning Kamp       | UNIX since Zilog Zeus 3.20
[hidden email]         | TCP/IP since RFC 956
FreeBSD committer       | BSD since 4.3-tahoe    
Never attribute to malice what can adequately be explained by incompetence.

Re: Large virtual page size support.

Alan Cox-5
On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:

> In message <[hidden email]>, Julian Elischer writes:
> >Jeff Roberson wrote:
> >
> >> I have implemented support in the vm for PAGE_SIZE values which are a
> >> multiple of the hardware page size.  This is primarily useful for two
> >> things:
> >
> >Mach (and the VM system we inherited from it) had this. I believe it was
> >removed with the comment
> >"If we need this and someone is willing to support it, it can be added
> >back".
>
> It was a VAX artifact and not very usable.  I believe we have a couple
> of comments and macros which still talk about "clicks".
>

It was not a VAX artifact.  Also, Jeff's work differs significantly
from what Mach did on the VAX.  I'll explain the differences after
my paper deadline passes tomorrow.

Alan

Re: Large virtual page size support.

Peter Jeremy
In reply to this post by Jeff Roberson
On Tue, 2006-Jan-17 00:42:57 -0800, Jeff Roberson wrote:
>I have implemented support in the vm for PAGE_SIZE values which are a
>multiple of the hardware page size.  This is primarily useful for two
>things:
>
>1) Shrinking the size of the vm page array so that very large memory x86
>PAE machines may boot.
>
>2) Improving performance of many operations due to decreased page list
>sizes as well as improved efficiency of many vm operations.  In the

I presume this would also reduce the KVA size on non-PAE large memory
i386 systems.  The code looks quite clean - I'd say it's a worthwhile
addition to FreeBSD.

--
Peter Jeremy

Re: Large virtual page size support.

Alan Cox-5
In reply to this post by Poul-Henning Kamp
On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:

> In message <[hidden email]>, Julian Elischer writes:
> >Jeff Roberson wrote:
> >
> >> I have implemented support in the vm for PAGE_SIZE values which are a
> >> multiple of the hardware page size.  This is primarily useful for two
> >> things:
> >
> >Mach (and the VM system we inherited from it) had this. I believe it was
> >removed with the comment
> >"If we need this and someone is willing to support it, it can be added
> >back".
>
> It was a VAX artifact and not very usable.  I believe we have a couple
> of comments and macros which still talk about "clicks".

Like Jeff's patch, Mach's VM design allowed for two distinct page
sizes, one being the native, hardware page size and the other being a
larger, abstract page size.  The essential difference between Jeff's
patch and what Mach did on the VAX is that Mach's use of the native,
hardware page size was entirely within the pmap and locore-level code.
For example, the hardware-supported page size on the VAX was 512
bytes.  However, as far as the machine-independent layer of the Mach
kernel was concerned, the page size was 4K bytes.  This included the
machine-independent part of the virtual memory system; it too believed
that the page size was 4K bytes.  As a consequence, the granularity
of mappings and protection was 4K bytes.  Finally, there was nothing
VAX-specific about the design and implementation of this feature.
However, I don't recall any other pmap implementations having
different native and abstract page sizes.  Today, I speculate that you
could implement a distinct native and abstract page size on the sparc
because different versions of the processor have had different page sizes.
Consequently, the ABI documents that I've seen don't specify a
particular page size, only that 64K bytes is the largest that a page
will ever be; to learn the precise page size, they say that you must
call the OS at run time.  So, you could use a larger abstract page
without breaking the ABI.
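
As an illustration of "call the OS at run time", a portable program can ask for the page size through the POSIX sysconf() interface instead of hard-coding it. This is a minimal sketch; the helper names are made up for the example.

```c
#include <assert.h>
#include <unistd.h>

/* Ask the OS for the page size rather than assuming a constant. */
static long
runtime_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	assert(sz > 0);		/* sysconf() returns -1 on error */
	return (sz);
}

/* Round a length up to a whole number of pages (hypothetical helper). */
static unsigned long
round_to_page(unsigned long len)
{
	unsigned long ps = (unsigned long)runtime_page_size();

	return ((len + ps - 1) / ps * ps);
}
```

A program written this way keeps working if the OS reports a larger abstract page size, which is why such an ABI leaves room for one.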

In contrast, Jeff's patch has both the machine-dependent and
machine-independent layers knowing about both page sizes.  Moreover,
the granularity of mappings and protection is still the native,
hardware page size.  In other words, within the vm_map the page size
is the native, hardware page size, but over in the vm_object the page
size is the larger, abstract size.  (Reread the last sentence
before continuing.)  As you can imagine, this is a lot trickier to get
right in the first place and maintain in the long run than what Mach
did.  This is why Jeff is being so circumspect about committing this
work.  On the other hand, it offers essentially the same benefits as what
Mach did without breaking the i386 ABI.
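
One way to picture the two granularities: a map-level offset is tracked in native pages, and it has to be translated into an abstract page plus a native sub-page inside it. The sketch below assumes 4K native and 8K abstract pages. Every name in it is hypothetical and it is not code from the patch.

```c
#include <assert.h>

#define NATIVE_PAGE_SHIFT	12u	/* assumed 4K hardware page */
#define ABSTRACT_PAGE_SHIFT	13u	/* assumed 8K abstract page */
#define SUBPAGES_PER_PAGE	(1u << (ABSTRACT_PAGE_SHIFT - NATIVE_PAGE_SHIFT))

/* Where a native-page-granular offset lands inside an object. */
struct subpage_ref {
	unsigned long	pindex;	/* index of the abstract page in the object */
	unsigned int	sub;	/* native page within that abstract page */
};

static struct subpage_ref
locate_subpage(unsigned long object_offset)
{
	struct subpage_ref ref;

	ref.pindex = object_offset >> ABSTRACT_PAGE_SHIFT;
	ref.sub = (unsigned int)((object_offset >> NATIVE_PAGE_SHIFT) &
	    (SUBPAGES_PER_PAGE - 1));
	return (ref);
}
```

Offsets 0x2000 and 0x3000 both resolve to abstract page 1, as sub-pages 0 and 1, so two mappings with different protections can share one abstract page. That is the kind of case the machine-independent code has to handle.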

Alan

Re: Large virtual page size support.

Julian Elischer

Was this the reason that it was done in a different way?
What was the reason not to do it entirely in the pmap layer (as in Mach)?
I know the Mach people were very proud of their implementation. It
always appeared in their technical descriptions.

The phrase "this is a lot trickier to [...] maintain in the long run"
worries me.  There must be a reason not to go with the simpler approach.
What was it?


Re: Large virtual page size support.

Jeff Roberson
On Fri, 20 Jan 2006, Julian Elischer wrote:

> Was this the reason that it was done in a different way?
> What was the reason not to do it entirely in the pmap layer (as in Mach)?
> I know the Mach people were very proud of their implementation. It
> always appeared in their technical descriptions.
>
> The phrase "this is a lot trickier to [...] maintain in the long run"
> worries me.  There must be a reason not to go with the simpler approach.
> What was it?

The Mach approach doesn't maintain backwards compatibility.  I originally
implemented it the Mach way, but then you have to recompile the entire
system with the larger page size.  This patch grew the machine-independent
(MI) parts to support existing binaries.
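
A small sketch of why existing binaries are the sticking point: they were built assuming 4K pages, so they may place segments or request mappings at addresses that are multiples of 4K but not of 8K. The helper and the example address below are illustrative assumptions only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: can a mapping start at addr if the kernel's
 * mapping granularity is `granularity` bytes (a power of two)?
 */
static bool
mapping_aligned(uintptr_t addr, uintptr_t granularity)
{
	assert(granularity != 0 && (granularity & (granularity - 1)) == 0);
	return ((addr & (granularity - 1)) == 0);
}
```

For an address such as 0x08049000, mapping_aligned() is true at 4096 but false at 8192. A kernel whose mapping granularity is the larger page would reject or misplace such a request, while keeping native-size granularity in the map lets the old binary run unchanged.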

It is complex.  I was hoping for someone to chime in and say "That's
great, we need that" or "No, that's not useful at all".  Unfortunately,
the response is somewhere in the middle.  I guess the best course is to
port it forward and test it on some x86 machines and see if it makes a big
difference.

Cheers,
Jeff


Re: Large virtual page size support.

Scott Long-2
Jeff Roberson wrote:

> It doesn't maintain backwards compatibility.  I originally implemented
> it in the mach way, but you have to recompile the entire system with the
> larger page size.  This patch grew the MI parts to support existing
> binaries.
>
> It is complex.  I was hoping for someone to chime in and say "That's
> great, we need that" or "No, that's not useful at all".  Unfortunately,
> the response is somewhere in the middle.  I guess the best course is to
> port it forward and test it on some x86 machines and see if it makes a
> big difference.
>
> Cheers,
> Jeff
>

Yes, we need that.  Please commit =-)

Scott

Re: Large virtual page size support.

Jeff Roberson
On Sun, 22 Jan 2006, Scott Long wrote:

> Yes, we need that.  Please commit =-)

Thanks for the encouragement.  There are a few unresolved issues.  Most
importantly, how are we presently dealing with config options that break
modules?  If the option is to stay as it is, modules will have to be aware
of the page size that is agreed upon by the rest of the kernel.

The other issues mostly concern ways to reduce the impact of the
patch on the rest of the system, and how to tidy it up a bit more, if it
can be.


Re: Large virtual page size support.

Scott Long-2
Jeff Roberson wrote:

> On Sun, 22 Jan 2006, Scott Long wrote:
>
>> Jeff Roberson wrote:
>>
>>> On Fri, 20 Jan 2006, Julian Elischer wrote:
>>>
>>>> Alan Cox wrote:
>>>>
>>>>> On Tue, Jan 17, 2006 at 11:18:56PM +0100, Poul-Henning Kamp wrote:
>>>>>
>>>>>> In message <[hidden email]>, Julian Elischer writes:
>>>>>>
>>>>>>> Jeff Roberson wrote:
>>>>>>>
>>>>>>>
>>>>>>>> I have implemented support in the vm for PAGE_SIZE values which
>>>>>>>> are a multiple of the hardware page size.  This is primarily
>>>>>>>> useful for two things:
>>>>>>>>
>>>>>>> Mach (and the VM system we inherited from it) had this. I believe
>>>>>>> it was removed with the comment
>>>>>>> "If we need this and someone is willing to support it, it can be
>>>>>>> added back".
>>>>>>>
>>>>>> It was a VAX artifact and not very usable.  I believe we have a couple
>>>>>> of comments and macros which still talk about "clicks".
>>>>>>
>>>>>
>>>>> Like Jeff's patch, Mach's VM design allowed for two distinct page
>>>>> sizes, one being the native, hardware page size and the other being a
>>>>> larger, abstract page size.  The essential difference between Jeff's
>>>>> patch and what Mach did on the VAX is that Mach's use of the native,
>>>>> hardware page size was entirely within the pmap and locore-level code.
>>>>> For example, the hardware-supported page size on the VAX was 512
>>>>> bytes.  However, as far as the machine-independent layer of the Mach
>>>>> kernel was concerned the page size was 4K bytes.  This included the
>>>>> machine-independent part of the virtual memory system; it too believed
>>>>> that the page size was 4K bytes.  As a consequence, the granularity
>>>>> of mappings and protection was 4K bytes.  Finally, there was nothing
>>>>> VAX-specific about the design and implementation of this feature.
>>>>> However, I don't recall any other pmap implementations having
>>>>> different native and abstract page sizes.  Today, I speculate that you
>>>>> could implement a distinct native and abstract page size on the sparc
>>>>> because different versions of the processor have had different page sizes.
>>>>> Consequently, the ABI documents that I've seen don't specify a
>>>>> particular page size only that 64K bytes is the largest that a page
>>>>> will ever be; to learn the precise page size, they say that you must
>>>>> call the OS at run time.  So, you could use a larger abstract page
>>>>> without breaking the ABI.
>>>>>
>>>>> In contrast, Jeff's patch has both the machine-dependent and
>>>>> machine-independent layers knowing about both page sizes.  Moreover,
>>>>> the granularity of mappings and protection is still the native,
>>>>> hardware page size.  In other words, within the vm_map the page size
>>>>> is the native, hardware page size, but over in the vm_object the page
>>>>> size is the larger, abstract size.  (Reread the last sentence again
>>>>> before continuing.)  As you can imagine, this is a lot trickier to get
>>>>> right in the first place and maintain in the long run than what Mach
>>>>> did.  This is why Jeff is being so circumspect about committing this
>>>>> work.  On the other hand, it offers essentially the same benefits as what
>>>>> Mach did without breaking the i386 ABI.
>>>>>
>>>>
>>>> Was this the reason that it was done in a different way?
>>>> What was the reason not to do it entirely in the pmap layer (as
>>>> Mach did)?
>>>> I know the Mach people were very proud of their implementation. It
>>>> always appeared in their technical descriptions.
>>>>
>>>> The phrase "this is a lot trickier to [...] maintain in the long run"
>>>> worries me..   There must be a reason to not go with the simpler
>>>> approach..
>>>> What was it?
>>>
>>>
>>>
>>> It doesn't maintain backwards compatibility.  I originally
>>> implemented it in the mach way, but you have to recompile the entire
>>> system with the larger page size.  This patch grew the MI parts to
>>> support existing binaries.
>>>
>>> It is complex.  I was hoping for someone to chime in and say "That's
>>> great, we need that" or "No, that's not useful at all".  
>>> Unfortunately, the response is somewhere in the middle.  I guess the
>>> best course is to port it forward and test it on some x86 machines
>>> and see if it makes a big difference.
>>>
>>> Cheers,
>>> Jeff
>>>
>>
>> Yes, we need that.  Please commit =-)
>
>
> Thanks for the encouragement.  There are a few unresolved issues.  Most
> importantly, how are we presently dealing with config options that break
> modules?  If the option is to stay as it is, modules will have to be
> aware of the page size that is agreed upon by the rest of the kernel.
>
> The other issues mostly concern ways to reduce the impact of
> the patch on the rest of the system.  How to tidy it up a bit more, if
> it can be.
>
>>
>> Scott
>>

PAE and MAC are two options that break the ABI for modules.  As long
as modules are compiled as part of the 'makekernel' target, they will
get the correct ABI.
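
To illustrate the concern about modules built against a different page
size, here is a minimal sketch of a load-time check.  It is not taken
from the patch and does not describe any existing FreeBSD interface:
KERNEL_PAGE_SIZE, struct mod_abi_info, MOD_ABI_INFO_INIT and
mod_abi_compatible() are all hypothetical names, and 8192 is just the
example value from the original post.

```c
#include <stddef.h>

/* Assumption: the virtual page size the kernel was configured with. */
#ifndef KERNEL_PAGE_SIZE
#define KERNEL_PAGE_SIZE 8192
#endif

/* Hypothetical record that a module embeds when it is compiled. */
struct mod_abi_info {
	size_t page_size;	/* page size seen at module build time */
};

/* Hypothetical initializer capturing the build-time page size. */
#define MOD_ABI_INFO_INIT { KERNEL_PAGE_SIZE }

/*
 * Return 1 if the module was built with the same page size as the
 * running kernel, 0 otherwise.
 */
int
mod_abi_compatible(const struct mod_abi_info *mod, size_t kernel_page_size)
{
	return (mod->page_size == kernel_page_size);
}
```

A module built in the same tree as the kernel would pass such a check
trivially; the check only matters for modules built separately.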

Scott

Re: Large virtual page size support.

Julian Elischer
In reply to this post by Jeff Roberson
Jeff Roberson wrote:

>
> Thanks for the encouragement.  There are a few unresolved issues.  
> Most importantly, how are we presently dealing with config options
> that break modules?  If the option is to stay as it is, modules will
> have to be aware of the page size that is agreed upon by the rest of
> the kernel.


Well, we could, as a matter of principle, make the page size a variable
and not a constant.
We did the same with HZ, which was always a #define.
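
A rough sketch of what "a variable and not a constant" could look like
follows.  It is only an illustration under stated assumptions, not code
from Jeff's patch: CPU_PAGE_SIZE is the name used in the original post,
the 4K hardware page is the x86 case, and vm_page_size, vm_page_shift,
vm_set_page_size(), vm_round_page() and vm_btop() are hypothetical
names.

```c
#include <stddef.h>

/* Hardware page size: still a compile-time constant (4K assumed). */
#define CPU_PAGE_SHIFT	12
#define CPU_PAGE_SIZE	((size_t)1 << CPU_PAGE_SHIFT)

/* Hypothetical run-time replacements for a #define'd PAGE_SIZE. */
size_t vm_page_shift = CPU_PAGE_SHIFT;
size_t vm_page_size = CPU_PAGE_SIZE;

/*
 * Set the virtual page size once at boot.  The size must be a power
 * of two and no smaller than the hardware page.  Returns 0 on
 * success and -1 if the value is rejected.
 */
int
vm_set_page_size(size_t size)
{
	size_t shift = 0;

	if (size < CPU_PAGE_SIZE || (size & (size - 1)) != 0)
		return (-1);
	while (((size_t)1 << shift) < size)
		shift++;
	vm_page_shift = shift;
	vm_page_size = size;
	return (0);
}

/* Round a byte count up to a whole number of virtual pages. */
size_t
vm_round_page(size_t x)
{
	return ((x + vm_page_size - 1) & ~(vm_page_size - 1));
}

/* Convert a byte count to virtual pages, truncating. */
size_t
vm_btop(size_t x)
{
	return (x >> vm_page_shift);
}
```

The cost is that every place that now uses the constant in a static
array size or a compile-time expression would need to be revisited.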


>
> The other issues mostly concern ways to reduce the impact of
> the patch on the rest of the system.  How to tidy it up a bit more, if
> it can be.
>
>>
>> Scott
>>