Re: ZFS, NFS and Network tuning (Paul Patterson)

Michelle Li
...and the dmesg?
 
please post

[hidden email] wrote: Send freebsd-performance mailing list submissions to
 [hidden email]

To subscribe or unsubscribe via the World Wide Web, visit
 http://lists.freebsd.org/mailman/listinfo/freebsd-performance
or, via email, send a message with subject or body 'help' to
 [hidden email]

You can reach the person managing the list at
 [hidden email]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of freebsd-performance digest..."


Today's Topics:

   1. Re: ZFS, NFS and Network tuning (Paul Patterson)
   2. Re: ZFS, NFS and Network tuning (Paul Patterson)
   3. Re: ZFS, NFS and Network tuning (Paul Patterson)
   4. intel i7 and Hyperthreading (Mike Tancsa)


----------------------------------------------------------------------

Message: 1
Date: Fri, 19 Dec 2008 06:47:59 -0800 (PST)
From: Paul Patterson
Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson, [hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hi,

As promised, here is the parameter tuning I have on the box (does anyone see anything wrong?):

/boot/loader.conf

kern.hz="100"
vm.kmem_size_max="1536M"
vm.kmem_size="1536M"
vfs.zfs.prefetch_disable=1

/etc/sysctl.conf

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.maxvnodes=600000
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=16384
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_auto=1
net.inet.tcp.sendbuf_inc=8192
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65536
net.local.stream.recvspace=65536
net.inet.tcp.sendbuf_max=16777216
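
For anyone reviewing the socket-buffer values above, one quick sanity check is the window/RTT bound: a single TCP stream cannot move more than one window of data per round trip. The sketch below is only arithmetic. The buffer sizes are taken from the sysctl.conf above, while the round-trip times are assumed example values, not measurements from this network.

```python
# Upper bound on single-stream TCP throughput: window / RTT.
# Buffer sizes come from the sysctl.conf above; the RTTs are
# assumed example values, not measurements from this setup.

def max_throughput_mb_s(window_bytes: int, rtt_seconds: float) -> float:
    """Return the window-limited throughput ceiling in MB/s (10^6 bytes)."""
    return window_bytes / rtt_seconds / 1e6


def rtt_to_fill_link(window_bytes: int, link_bits_per_s: float) -> float:
    """Return the largest RTT (seconds) at which one window still fills the link."""
    return window_bytes * 8 / link_bits_per_s


if __name__ == "__main__":
    sendspace = 65536        # net.inet.tcp.sendspace
    sendbuf_max = 16777216   # net.inet.tcp.sendbuf_max
    gigabit = 1e9            # 1 Gb/s link

    for rtt_ms in (0.2, 1.0, 10.0):  # assumed RTTs for illustration
        cap = max_throughput_mb_s(sendspace, rtt_ms / 1000)
        print(f"64 KB window, RTT {rtt_ms} ms: at most {cap:.1f} MB/s")

    ms_64k = rtt_to_fill_link(sendspace, gigabit) * 1000
    ms_16m = rtt_to_fill_link(sendbuf_max, gigabit) * 1000
    print(f"64 KB fills 1 Gb/s up to an RTT of {ms_64k:.3f} ms")
    print(f"16 MB fills 1 Gb/s up to an RTT of {ms_16m:.1f} ms")
```

If the real LAN round-trip time is well under a millisecond, this bound would not by itself explain single-digit MB/sec figures, but it is cheap to rule out.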





________________________________
From: Paul Patterson
To: [hidden email]
Sent: Thursday, December 18, 2008 8:04:37 PM
Subject: ZFS, NFS and Network tuning

Hi,

I just set up my first machine with ZFS.  (First, ZFS is nothing short of amazing.)  I'm running FreeBSD 7.1-RC1 as an NFS server with ZFS striped across two volumes (just testing throughput for now).  Anyhow, I was benching this box: 4GB of RAM, the volume is on 2x146 GB SAS 10K rpm drives, and it's an HP ProLiant DL360 with dual Gb interfaces (device bce).

Now, I believe that I have tuned this box to the hilt with all the parameters that I can think of (it's at work right now so I'll cut and paste all the sysctls and loader.conf parameters for ZFS and networking) and it still seems to have some type of bottleneck.

I have two Debian Linux clients that I use to bench with.  I run a script that makes calls that write to the NFS device and, after about 30 minutes, starts to delete the initial data and follows behind, writing and deleting.

Here's what's happening:  The "other" machine is a NetAPP.  It's got 1GB of RAM and it's running RAID DP with 2 parity drives and 6 data drives, all SATA 750 GB 7200 RPM drives with dual Gb interfaces.  

The benchmark script manages to write lots of little files (all less than 30KB) at a rate of 11,000 per minute.  However, after 30 minutes, when it starts deleting, the write throughput drops to 9500 per minute and deletion runs at 6000 per minute.  If I turn on the second node, I get 17,000 writes combined with about 11,000 deletions combined.  One way or another, this will overflow in time.  Not good.
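
The actual benchmark script is not included in the thread. For readers who want to reproduce a similar load, here is a minimal sketch of the workload as described: write small files continuously and, once a delay has passed, also delete the oldest file after each write. The function name, parameters and file naming are hypothetical, and the short durations in the usage part are for demonstration only (the post describes a 30-minute delay).

```python
import os
import random
import tempfile
import time
from collections import deque


def run_benchmark(target_dir, duration_s, delete_after_s, max_size=30 * 1024):
    """Write small files into target_dir for duration_s seconds.

    Once delete_after_s seconds have elapsed, delete the oldest
    remaining file after every write. Returns (written, deleted).
    """
    written = deleted = 0
    pending = deque()  # paths in creation order, oldest first
    start = time.monotonic()
    while True:
        elapsed = time.monotonic() - start
        if elapsed >= duration_s:
            break
        path = os.path.join(target_dir, f"f{written:08d}.dat")
        with open(path, "wb") as fh:
            # "all less than 30KB": pick a size strictly below max_size
            fh.write(os.urandom(random.randint(1, max_size - 1)))
        pending.append(path)
        written += 1
        if elapsed >= delete_after_s and pending:
            os.remove(pending.popleft())
            deleted += 1
    return written, deleted


if __name__ == "__main__":
    # Demonstration against a local temporary directory; point
    # target_dir at the NFS mount to exercise a server instead.
    with tempfile.TemporaryDirectory() as d:
        w, r = run_benchmark(d, duration_s=0.2, delete_after_s=0.1)
        print(f"written={w} deleted={r} remaining={len(os.listdir(d))}")
```

Dividing the returned counts by the elapsed minutes gives figures comparable to the per-minute rates quoted above.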

Now, on to my pet project. :-)  The FreeBSD/ZFS server is only able to maintain about 3500 writes per minute but also deletes at the same rate!  (I would expect deletion to be at least as fast as writing)  The drives are running at only 20-35% while this is going on and only putting down about 4-5 MB/sec each.  So, at 1Gb or ~92MB/sec theoretical max (is that about right?) There's something wrong somewhere.  I'm assuming it's the network.  (I'll post all the tunings tomorrow.)
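
On the "is that about right?" question, the arithmetic can be sketched as follows. This assumes a standard 1500-byte MTU (no jumbo frames are mentioned in the thread) and TCP timestamps enabled, which matches net.inet.tcp.rfc1323=1. The second function only gives an upper bound on payload, since the post says each file is less than 30KB.

```python
# Rough ceilings for the numbers in the post. Assumptions (not from
# the thread): 1500-byte MTU, 52 bytes of IP+TCP headers per segment
# (20 IP + 20 TCP + 12 timestamp option), and 38 bytes of per-frame
# Ethernet overhead (14 header + 4 FCS + 8 preamble + 12 inter-frame gap).

def tcp_goodput_mb_s(link_bits_per_s=1e9, mtu=1500,
                     tcp_ip_headers=52, wire_overhead=38):
    """Best-case TCP payload rate on the link, in MB/s (10^6 bytes)."""
    payload = mtu - tcp_ip_headers
    frame = mtu + wire_overhead
    return link_bits_per_s / 8 * payload / frame / 1e6


def small_file_payload_mb_s(files_per_minute, max_file_bytes=30 * 1024):
    """Upper bound on payload rate when every file is at most max_file_bytes."""
    return files_per_minute * max_file_bytes / 60 / 1e6


if __name__ == "__main__":
    print(f"1 Gb/s TCP goodput ceiling: {tcp_goodput_mb_s():.1f} MB/s")
    print(f"3500 files/min  -> at most {small_file_payload_mb_s(3500):.2f} MB/s")
    print(f"11000 files/min -> at most {small_file_payload_mb_s(11000):.2f} MB/s")
```

Under these assumptions the link ceiling is closer to 118 MB/sec than 92, and even 11,000 files per minute is only a few MB/sec of payload, so a small-file test like this is far more likely to be limited by per-operation latency than by raw link bandwidth.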

Thinking something was wrong, I mounted only one client to each server (they are identical clients and the same configuration as the FreeBSD box).  I did a simple stream of:  dd if=/dev/zero of=/mnt/nfs bs=1m count=1000.  The FreeBSD box wins?!  It cranked up the drives to 45-50 MB/sec each and balanced them perfectly on transactions/sec, KB/sec, etc. from systat -vm. (Woohoo!)  The NetAPP's CPU was at over 35-40% constantly (it does that while benching, too).

I'll post the NetAPP finding tomorrow as I forgot it for now.

As for the client mounting, it was with the options:  nfsvers=3,rsize=32768,wsize=32768,hard,intr,async,noatime

I'm trying to figure out why, when running this benchmark, the NetAPP with WAFL can nearly triple the FreeBSD/ZFS box.

Also, I'm having something strange happen when I try to mount the disk from the FreeBSD server versus the NetAPP.  The FreeBSD server will sometimes RPC timeout.  Mounting the NetAPP is instantaneous.

That's the beginning.  If I have a list of things to check tomorrow, I will.  I'd like to see the little machine that could kick the NetAPP's butt.  (No offense to NetAPP. :-) )

Thank you for reading,

Paul


     
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[hidden email]"



     


------------------------------

Message: 2
Date: Fri, 19 Dec 2008 10:03:14 -0800 (PST)
From: Paul Patterson
Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson, [hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hello all,

I guess I've got to send this as I've already had about 5 responses claiming the same thing.  This is not a disk bottleneck.  The ZFS partition is capable of performing at the theoretical max of the drives.  The machine is performing at less than 5 MB combined.  I'm assuming that this is a problem with the NFSv3 throughput.  I just 'dd'  1000 1MB records (about 1GB) from the clients to their respective servers:

Client 1 to NetAPP:  3 tests for 45.9, 45.1, 46.1   Pretty consistent
Client 2 to FreeBSD/ZFS:  3 tests for 29.7, 12.5, 19.1  NOT consistent  (also, the drives were lucky to hit 12% busy.)
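
To put a number on "consistent", the three runs per server can be summarized with a mean and a coefficient of variation. This is a small sketch using the figures quoted above; the post does not state the unit, so MB/sec is an assumption here.

```python
from statistics import mean, stdev

# dd results quoted above (unit assumed to be MB/sec).
netapp_runs = [45.9, 45.1, 46.1]
freebsd_runs = [29.7, 12.5, 19.1]


def summarize(runs):
    """Return (mean, sample standard deviation, coefficient of variation)."""
    m = mean(runs)
    s = stdev(runs)
    return m, s, s / m


if __name__ == "__main__":
    for name, runs in (("NetAPP", netapp_runs), ("FreeBSD/ZFS", freebsd_runs)):
        m, s, cv = summarize(runs)
        print(f"{name}: mean={m:.1f} stdev={s:.2f} cv={cv:.1%}")
```

With only three samples per server this is indicative at best, but the spread on the FreeBSD runs is dozens of times larger relative to its mean than the spread on the NetAPP runs.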

I'm about to mount these servers to each client and see if there's a variation (although they are hw configured the same and bought the same time.)

I'll write after this.  In the meantime, it would help if more people could review the configurations from my previous message and see if there's anything glaring.  The lack of consistency suggests something is wrong network-wise.

P.






     


------------------------------

Message: 3
Date: Fri, 19 Dec 2008 10:59:54 -0800 (PST)
From: Paul Patterson
Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson, [hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hi,

Well, I got some input on things:

kern.ipc.somaxconn=32768
net.inet.tcp.mssdflt=1460

And for fstab

rw,tcp,intr,noatime,nfsv3,-w=65536,-r=65536

I tried turning on polling with ifconfig bce0 polling; however, I didn't see it in the ifconfig bce0 output, so I believe either it isn't active or the card doesn't support it.

I also removed async from the mounts.  These had a detrimental effect on the FreeBSD server.  I now get 64K per transfer (systat -vm) but I'm still only getting about 4MB/sec on the disks and their utilization has dropped to about 5%.  Throughput from both clients is ~8.5MB/sec.  The tests were run separately.  The NetAPP on each host was over 48.5 MB/sec.

The FreeBSD host still has about 2 GB free.

Paul







     


------------------------------

Message: 4
Date: Fri, 19 Dec 2008 17:01:46 -0500
From: Mike Tancsa
Subject: intel i7 and Hyperthreading
To: [hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset="us-ascii"; format=flowed

Just got our first board to play around with and unlike in the past,
having hyperthreading enabled seems to help performance.... At least
in buildworld tests.

doing a make -j4 vs -j6 vs -j8 vs -j10 gives

-j  buildworld time    % improvement over -j4
4       13:57
6       12:11            13%
8       11:32            18%
10      11:43            17%
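
For reference, the improvement column can be recomputed from the wall-clock times as (baseline - time) / baseline. The sketch below uses only the times in the table; the helper names are made up for illustration. Computed this way the values land within about one percentage point of the posted figures.

```python
# buildworld wall-clock times from the table above, keyed by -j value.
times = {4: "13:57", 6: "12:11", 8: "11:32", 10: "11:43"}


def to_seconds(mmss: str) -> int:
    """Convert an 'MM:SS' string to seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def improvement_pct(baseline_s: int, time_s: int) -> float:
    """Percent reduction in build time relative to the baseline."""
    return (baseline_s - time_s) / baseline_s * 100


if __name__ == "__main__":
    base = to_seconds(times[4])
    for jobs in (6, 8, 10):
        pct = improvement_pct(base, to_seconds(times[jobs]))
        print(f"-j{jobs}: {times[jobs]}  {pct:.1f}% faster than -j4")
```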


dmesg below of the hardware... The CPU seems to run fairly cool, but
the board has a lot of nasty hot heatsinks

eg. running 8 burnP6 procs

0[ns3c]# sysctl -a | grep temperature
dev.cpu.0.temperature: 67
dev.cpu.1.temperature: 67
dev.cpu.2.temperature: 65
dev.cpu.3.temperature: 65
dev.cpu.4.temperature: 66
dev.cpu.5.temperature: 66
dev.cpu.6.temperature: 64
dev.cpu.7.temperature: 64
0[ns3c]#

vs idle

dev.cpu.0.temperature: 46
dev.cpu.1.temperature: 46
dev.cpu.2.temperature: 42
dev.cpu.3.temperature: 42
dev.cpu.4.temperature: 44
dev.cpu.5.temperature: 44
dev.cpu.6.temperature: 40
dev.cpu.7.temperature: 40

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Fri Dec 19 19:48:15 EST 2008
     [hidden email]:/usr/obj/usr/src/sys/recycle
Timecounter "i8254" frequency 1193182 Hz quality 0

=== message truncated ===
















       

Re: ZFS, NFS and Network tuning (Paul Patterson)

pathiaki2
Michelle,

just found this in the mail.  However, it looks like the VPN to work is down.  I can't get to the machine right now.  I'll mail on Monday.

P.




________________________________
From: Michelle Li <[hidden email]>
To: [hidden email]
Sent: Saturday, December 20, 2008 7:25:02 PM
Subject: Re: ZFS, NFS and Network tuning (Paul Patterson)

...and the dmesg?

please post

[hidden email] wrote: Send freebsd-performance mailing list submissions to
[hidden email]

To subscribe or unsubscribe via the World Wide Web, visit
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
or, via email, send a message with subject or body 'help' to
[hidden email]

You can reach the person managing the list at
[hidden email]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of freebsd-performance digest..."


Today's Topics:

   1. Re: ZFS, NFS and Network tuning (Paul Patterson)
   2. Re: ZFS, NFS and Network tuning (Paul Patterson)
   3. Re: ZFS, NFS and Network tuning (Paul Patterson)
   4. intel i7 and Hyperthreading (Mike Tancsa)


----------------------------------------------------------------------

Message: 1
Date: Fri, 19 Dec 2008 06:47:59 -0800 (PST)
From: Paul Patterson

Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson
,
[hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hi,

as promised, the parameter tuning I have on the box (does anyone see anything wrong?)

/boot/loader.conf

kern.hz="100"
vm.kmem_size_max="1536M"
vm.kmem_size="1536M"
vfs.zfs.prefetch_disble=1

/etc/sysctl.conf

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.mxvnodes=600000
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=16384
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_auto=1
net.inet.tcpsendbuf_inc=8192
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65536
net.local.stream.recvspace=65536
net.inet.tcp.sendbuf_max=16777216





________________________________
From: Paul Patterson

To: [hidden email]
Sent: Thursday, December 18, 2008 8:04:37 PM
Subject: ZFS, NFS and Network tuning

Hi,

I just set up my first machine with ZFS.  (First, ZFS is nothing short of amazing)  I'm running FreeBSD 7.1-RC1 as an NFS server with ZFS striped across two volumes (just testing throughput for now.)  Anyhow, I was benching this box, 4GB or RAM, the volume is on 2x146 GB SAS 10K rpm drives and it's an HP Proliant DL360 with dual Gb interfaces. (device bce)

Now, I believe that I have tuned this box to the hilt with all the parameters that I can think of (it's at work right now so I'll cut and paste all the sysctls and loader.conf parameters for ZFS and networking) and it still seems to have some type of bottleneck.

I have two Debian Linux clients that I use to bench with.  I run a script that makes calls that writes to the NFS device and, after about 30 minutes, starts to delete the initial data and follow behind writing and deleting.

Here's what's happening:  The "other" machine is a NetAPP.  It's got 1GB of RAM and it's running RAID DP with 2 parity drives and 6 data drives, all SATA 750 GB 7200 RPM drives with dual Gb interfaces.  

The benchmark script manages to write lots of little (all less than 30KB) files at a rate of 11,000 per minute, however, after 30 minutes, when it starts deleting, the throughput on write goes to 9500 and deletion is 6000 per minute.  If I turn on the second node, I get 17,000 writing combined with about 11,000 deletions combined.  One way or another, this will overflow in time.  Not good.

Now, on to my pet project. :-)  The FreeBSD/ZFS server is only able to maintain about 3500 writes per minute but also deletes at the same rate!  (I would expect deletion to be at least as fast as writing)  The drives are running at only 20-35% while this is going on and only putting down about 4-5 MB/sec each.  So, at 1Gb or ~92MB/sec theoretical max (is that about right?) There's something wrong somewhere.  I'm assuming it's the network.  (I'll post all the tunings tomorrow.)

Thinking something wrong, I mounted only one client to each server (they are identical clients and the same configuration as the FreeBSD box).  I did a simple stream of:  dd if=/dev/zero of=/mnt/nfs bs=1m count=1000.  The FreeBSD box wins?!  It cranked up the drives to 45-50 MB/sec each and balanced them perfectly on transactions/sec KB/sec, etc from systat -vm. (Woohoo!)  The NetAPPs CPU was at over 35-40% constantly, (it does that while benching, too)

I'll post the NetAPP finding tomorrow as I forgot it for now.

As for the client mounting, it was with the options:  nfsvers=3,rsize=32768,wsize=32768,hard,intr,async,noatime

I'm trying to figure out why, when running this benchmark, can the NetAPP with WAFL nearly triple the FreeBSD/ZFS box.  

Also, I'm having something strange happen when I try to mount the disk from the FreeBSD server versus the NetAPP.  The FreeBSD server will sometimes RPC timeout.  Mounting the NetAPP is instantaneous.

That's the beginning.  If I have a list of things to check tomorrow, I will.  I'd like to see the little machine that could kick the NetAPPs butt.  (No offense to NetAPP. :-) )

Thank you for reading,

Paul


     
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[hidden email]"



     


------------------------------

Message: 2
Date: Fri, 19 Dec 2008 10:03:14 -0800 (PST)
From: Paul Patterson

Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson
,
[hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hello all,

I guess I've got to send this as I've already had about 5 responses claiming the same thing.  This is not a disk bottleneck.  The ZFS partition is capable of performing at the theoretical max of the drives.  The machine is performing at less than 5 MB combined.  I'm assuming that this is a problem with the NFSv3 throughput.  I just 'dd'  1000 1MB records (about 1GB) from the clients to their respective servers:

Client 1 to NetAPP:  3 tests for 45.9, 45.1, 46.1   Pretty consistent
Client 2 to FreeBSD/ZFS:  3 test for 29.7, 12.5, 19.1  NOT consistent  (also, the drives were lucky to hit 12% busy.

I'm about to mount these servers to each client and see if there's a variation (although they are hw configured the same and bought the same time.)

I'll write after this.  However, if more people could review the configurations below and see if there's anything glaring....  However, the lack of consistency shows something is wrong network wise.

P.




________________________________
From: Paul Patterson

To: Paul Patterson
; [hidden email]
Sent: Friday, December 19, 2008 9:47:59 AM
Subject: Re: ZFS, NFS and Network tuning


Hi,

as promised, the parameter tuning I have on the box (does anyone see anything wrong?)

/boot/loader.conf

kern.hz="100"
vm.kmem_size_max="1536M"
vm.kmem_size="1536M"
vfs.zfs.prefetch_disble=1

/etc/sysctl.conf

kern.ipc.maxsockbuf=16777216
kern.ipc.nmbclusters=32768
kern.ipc.somaxconn=8192
kern.maxfiles=65536
kern.maxfilesperproc=32768
kern.mxvnodes=600000
net.inet.tcp.delayed_ack=0
net.inet.tcp.inflight.enable=0
net.inet.tcp.path_mtu_discovery=0
net.inet.tcp.recvbuf_auto=1
net.inet.tcp.recvbuf_inc=16384
net.inet.tcp.recvbuf_max=16777216
net.inet.tcp.recvspace=65536
net.inet.tcp.rfc1323=1
net.inet.tcp.sendbuf_auto=1
net.inet.tcpsendbuf_inc=8192
net.inet.tcp.sendspace=65536
net.inet.udp.maxdgram=57344
net.inet.udp.recvspace=65536
net.local.stream.recvspace=65536
net.inet.tcp.sendbuf_max=16777216





________________________________
From: Paul Patterson

To: [hidden email]
Sent: Thursday, December 18, 2008 8:04:37 PM
Subject: ZFS, NFS and Network tuning

Hi,

I just set up my first machine with ZFS.  (First, ZFS is nothing short of amazing)  I'm running FreeBSD 7.1-RC1 as an NFS server with ZFS striped across two volumes (just testing throughput for now.)  Anyhow, I was benching this box, 4GB or RAM, the volume is on 2x146 GB SAS 10K rpm drives and it's an HP Proliant DL360 with dual Gb interfaces. (device bce)

Now, I believe that I have tuned this box to the hilt with all the parameters that I can think of (it's at work right now so I'll cut and paste all the sysctls and loader.conf parameters for ZFS and networking) and it still seems to have some type of bottleneck.

I have two Debian Linux clients that I use to bench with.  I run a script that makes calls that writes to the NFS device and, after about 30 minutes, starts to delete the initial data and follow behind writing and deleting.

Here's what's happening:  The "other" machine is a NetAPP.  It's got 1GB of RAM and it's running RAID DP with 2 parity drives and 6 data drives, all SATA 750 GB 7200 RPM drives with dual Gb interfaces.  

The benchmark script manages to write lots of little (all less than 30KB) files at a rate of 11,000 per minute, however, after 30 minutes, when it starts deleting, the throughput on write goes to 9500 and deletion is 6000 per minute.  If I turn on the second node, I get 17,000 writing combined with about 11,000 deletions combined.  One way or another, this will overflow in time.  Not good.

Now, on to my pet project. :-)  The FreeBSD/ZFS server is only able to maintain about 3500 writes per minute but also deletes at the same rate!  (I would expect deletion to be at least as fast as writing)  The drives are running at only 20-35% while this is going on and only putting down about 4-5 MB/sec each.  So, at 1Gb or ~92MB/sec theoretical max (is that about right?) There's something wrong somewhere.  I'm assuming it's the network.  (I'll post all the tunings tomorrow.)

Thinking something wrong, I mounted only one client to each server (they are identical clients and the same configuration as the FreeBSD box).  I did a simple stream of:  dd if=/dev/zero of=/mnt/nfs bs=1m count=1000.  The FreeBSD box wins?!  It cranked up the drives to 45-50 MB/sec each and balanced them perfectly on transactions/sec KB/sec, etc from systat -vm. (Woohoo!)  The NetAPPs CPU was at over 35-40% constantly, (it does that while benching, too)

I'll post the NetAPP finding tomorrow as I forgot it for now.

As for the client mounting, it was with the options:  nfsvers=3,rsize=32768,wsize=32768,hard,intr,async,noatime

I'm trying to figure out why, when running this benchmark, can the NetAPP with WAFL nearly triple the FreeBSD/ZFS box.  

Also, I'm having something strange happen when I try to mount the disk from the FreeBSD server versus the NetAPP.  The FreeBSD server will sometimes RPC timeout.  Mounting the NetAPP is instantaneous.

That's the beginning.  If I have a list of things to check tomorrow, I will.  I'd like to see the little machine that could kick the NetAPPs butt.  (No offense to NetAPP. :-) )

Thank you for reading,

Paul


     
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[hidden email]"


     


------------------------------

Message: 3
Date: Fri, 19 Dec 2008 10:59:54 -0800 (PST)
From: Paul Patterson

Subject: Re: ZFS, NFS and Network tuning
To: Paul Patterson
,
[hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset=us-ascii

Hi,

Well, I got some input on things:

kern.ipc.somaxconn=32768
net.inet.tcp.mssdflt=1460

And for fstab

rw,tcp,intr,noatime,nfsv3,-w=65536,-r=65536

I tried turning on polling with ifconfig bce0 polling, however, I didn't see it in ifconfig bce0 so I don't believe it to be active or the card doesn't support it.

aI also removed async from the mounts.  These had a detrimental affect on the FreeBSD server.  I now get 64K per transfer (system -vm) but I'm still only getting about 4MB/sec on the disks and their utilization has dropped to about 5%.  Throughput from both clients is ~8.5MB/sec.  The tests were run separately.  The NetAPP on each host was over 48.5 MB/sec.

The FreeBSD host still has about 2 GB free.

Paul







     


------------------------------

Message: 4
Date: Fri, 19 Dec 2008 17:01:46 -0500
From: Mike Tancsa
Subject: intel i7 and Hyperthreading
To: [hidden email]
Message-ID: <[hidden email]>
Content-Type: text/plain; charset="us-ascii"; format=flowed

Just got our first board to play around with and unlike in the past,
having hyperthreading enabled seems to help performance.... At least
in buildworld tests.

Doing a make -j4 vs. -j6 vs. -j8 vs. -j10 gives

-j  buildworld time    % improvement over -j4
4       13:57
6       12:11            13%
8       11:32            18%
10      11:43            17%
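
For anyone who wants to recompute the percentage column, here is a minimal Python sketch. It assumes the times are minutes:seconds of wall-clock time and that "improvement" means the reduction in build time relative to -j4; the function names are made up for illustration. Computed this way, the values land within about a point of the rounded figures in the table.

```python
def to_seconds(mmss):
    """Convert a 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def improvement_pct(baseline, measured):
    """Percent reduction in build time relative to the baseline time."""
    base = to_seconds(baseline)
    return (base - to_seconds(measured)) / base * 100.0


# Times copied from the table above (-j value -> buildworld time).
times = {4: "13:57", 6: "12:11", 8: "11:32", 10: "11:43"}

for jobs, t in times.items():
    pct = improvement_pct(times[4], t)
    print(f"-j{jobs}: {t}  {pct:.1f}% faster than -j4")
```

The same helper works for other -j values; only the baseline entry needs to stay in the dictionary.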


dmesg of the hardware is below... The CPU seems to run fairly cool, but
the board has a lot of nasty hot heatsinks

e.g. running 8 burnP6 procs

0[ns3c]# sysctl -a | grep temperature
dev.cpu.0.temperature: 67
dev.cpu.1.temperature: 67
dev.cpu.2.temperature: 65
dev.cpu.3.temperature: 65
dev.cpu.4.temperature: 66
dev.cpu.5.temperature: 66
dev.cpu.6.temperature: 64
dev.cpu.7.temperature: 64
0[ns3c]#

vs idle

dev.cpu.0.temperature: 46
dev.cpu.1.temperature: 46
dev.cpu.2.temperature: 42
dev.cpu.3.temperature: 42
dev.cpu.4.temperature: 44
dev.cpu.5.temperature: 44
dev.cpu.6.temperature: 40
dev.cpu.7.temperature: 40
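
To compare the two readings core by core, a small Python sketch along these lines could be used. It only parses lines of the form shown above; the helper name is hypothetical, and the unit is assumed to be degrees Celsius since the output does not state one.

```python
def parse_temps(text):
    """Map CPU index -> temperature from 'dev.cpu.N.temperature: V' lines."""
    temps = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        parts = name.strip().split(".")
        # Skip shell prompts and anything that is not a temperature line.
        if sep and len(parts) == 4 and parts[:2] == ["dev", "cpu"] \
                and parts[3] == "temperature":
            temps[int(parts[2])] = int(value)
    return temps


# Readings copied from the two listings above.
load_values = [67, 67, 65, 65, 66, 66, 64, 64]
idle_values = [46, 46, 42, 42, 44, 44, 40, 40]
load_text = "\n".join(
    f"dev.cpu.{i}.temperature: {v}" for i, v in enumerate(load_values))
idle_text = "\n".join(
    f"dev.cpu.{i}.temperature: {v}" for i, v in enumerate(idle_values))

load = parse_temps(load_text)
idle = parse_temps(idle_text)
rise = {cpu: load[cpu] - idle[cpu] for cpu in sorted(load)}

for cpu, delta in rise.items():
    print(f"cpu{cpu}: idle {idle[cpu]}, load {load[cpu]}, rise {delta}")
```

On these numbers every logical CPU rises by between 21 and 24 under the burnP6 load.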

Copyright (c) 1992-2008 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
         The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 7.1-PRERELEASE #0: Fri Dec 19 19:48:15 EST 2008
    [hidden email]:/usr/obj/usr/src/sys/recycle
Timecounter "i8254" frequency 1193182 Hz quality 0

=== message truncated ===

Re: ZFS, NFS and Network tuning (Paul Patterson)

pathiaki2
In reply to this post by Michelle Li
Thanks to all.

I think this is the last post on this.  

James Chang (don't ever apologize for minor English issues when you're trying to help.  Anyone who would fault a person for that doesn't belong in the BSD community. :-)  Besides, you solved the problem.)

James suggested getting another card.  (I had already ordered one) It has an Intel chipset and comes up under the em driver.

I was getting 500+ Mb/sec consistently over two drives striped (SAS) with ZFS on the drives and running it from a Linux client over NFS.

I did a quick:

dd if=/dev/zero of=/mount/foo bs=64k count=10000  (/mount/foo is the FreeBSD server zfs drive on the linux client)

It transferred 6.6 GB in roughly 100 seconds at 66 MB/sec  ( 66 x 8 = 528 Mb )
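
The arithmetic behind those figures can be checked with a short Python sketch (the function names are invented for illustration, and it assumes dd's "k" suffix means 1024 bytes and that GB/MB are decimal units). Note that bs=64k with count=10000 comes to roughly 655 MB rather than 6.6 GB; 6.6 GB would match count=100000, so one of the numbers in the command as quoted may be a typo. The 66 MB/sec and 528 Mb/sec figures are consistent with 6.6 GB in 100 seconds.

```python
def dd_bytes(block_size, count):
    """Total bytes written by dd for a given block size and block count."""
    return block_size * count


def mbytes_to_mbits(mb_per_sec):
    """Convert megabytes per second to megabits per second."""
    return mb_per_sec * 8


# Size implied by the command as quoted: bs=64k count=10000.
quoted_size = dd_bytes(64 * 1024, 10000)

# Rate implied by the reported result: 6.6 GB in roughly 100 seconds.
reported_bytes = 6_600_000_000
reported_seconds = 100
mb_per_sec = reported_bytes / reported_seconds / 1_000_000

print(f"bs=64k count=10000 writes {quoted_size / 1e6:.0f} MB")
print(f"6.6 GB in 100 s is {mb_per_sec:.0f} MB/sec, "
      f"or {mbytes_to_mbits(mb_per_sec):.0f} Mb/sec")
```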

Zero_copy_sockets was enabled.
Polling was enabled in the kernel.

Sadly, when I turned on polling on this card, it consistently ran 10-20 MB/sec SLOWER.

The good thing is that it looks like either the Broadcom chipset sucks or the bce driver sucks.

Either way, just by popping the Intel card in and moving the cable, my throughput jumped by 500%.

Thank you everyone,

Paul



