network performance

Stefan Lambrev-2
Greetings,

I'm trying to test a bridge firewall under FreeBSD 7.

What I have as configuration is:

FreeBSD 7 (web server) - bridge (FreeBSD 7) - gigabit switch - flooders.

Both FreeBSD servers are running FreeBSD 7.0-RC1 amd64.
With netperf -l 60 -p 10303 -H 10.3.3.1 I have no problem reaching 116 MB/s,
with or without pf enabled.

But what I want to test is how well the firewall performs during SYN
floods. For this I'm using hping3 (hping-devel in ports) to generate
traffic from the flooders to the web server.
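
For the record, the flood itself is generated with something along these
lines (standard hping3 options; port 80 is just what I happen to target):

  # SYN flood the web server, spoofing a random source address per packet,
  # sending as fast as possible:
  hping3 -S -p 80 --flood --rand-source 10.3.3.1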

The first thing I noticed is that hping running on Linux generates twice
as much traffic as it does on FreeBSD, so I plan to set aside a server
that dual-boots Linux and FreeBSD to see what the real difference is.

The second problem I encountered is that hping, when run from FreeBSD,
exits after a few seconds or minutes with this error message:
[send_ip] sendto: No buffer space available
This happens on both FreeBSD 7 and FreeBSD 6.2-p8 (amd64).

Can I increase those buffers?

I'm able to generate a 24 MB/s SYN flood, and during the test I can see
this on the bridge firewall:
netstat -w 1 -I em0 -d (external network):
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls drops
    427613  1757   25656852     233604     0   14016924     0     0
    428089  1274   25685358     233794     0   14025174     0     0
    427433  1167   25645998     234775     0   14088834     0     0
    438270  2300   26296218     233384     0   14004474     0     0
    438425  2009   26305518     233858     0   14034114     0     0

and from the internal network:
            input          (em1)           output
   packets  errs      bytes    packets  errs      bytes colls drops
    232912     0   13974838     425796     0   25549446     0  1334
    234487     0   14069338     423986     0   25432026     0  1631
    233951     0   14037178     431330     0   25880286     0  3888
    233509     0   14010658     436496     0   26191986     0  1437
    234181     0   14050978     430291     0   25816806     0  4001
    234144     0   14048870     430208     0   25810206     0  1621
    234176     0   14050678     430292     0   25828926     0  3001

And here is top -S:

last pid: 21830;  load averages:  1.01,  0.50,  0.72  up 3+04:59:43  20:27:49
84 processes:  7 running, 60 sleeping, 17 waiting
CPU states:  0.0% user,  0.0% nice, 38.2% system,  0.0% interrupt, 61.8% idle
Mem: 17M Active, 159M Inact, 252M Wired, 120K Cache, 213M Buf, 1548M Free
Swap: 4056M Total, 4056M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   14 root        1 171 ki31     0K    16K CPU0   0  76.8H 100.00% idle: cpu0
   11 root        1 171 ki31     0K    16K RUN    3  76.0H 100.00% idle: cpu3
   25 root        1 -68    -     0K    16K CPU1   1  54:26 86.28% em0 taskq
   26 root        1 -68    -     0K    16K CPU2   2  39:13 66.70% em1 taskq
   12 root        1 171 ki31     0K    16K RUN    2  76.0H 37.50% idle: cpu2
   13 root        1 171 ki31     0K    16K RUN    1  75.9H 16.89% idle: cpu1
   16 root        1 -32    -     0K    16K WAIT   0   7:00  0.00% swi4: clock sio
   51 root        1  20    -     0K    16K syncer 3   4:30  0.00% syncer

vmstat -i
interrupt                          total       rate
irq1: atkbd0                         544          0
irq4: sio0                         10641          0
irq14: ata0                            1          0
irq19: uhci1+                     123697          0
cpu0: timer                    553887702       1997
irq256: em0                     48227501        173
irq257: em1                     46331164        167
cpu1: timer                    553887682       1997
cpu3: timer                    553887701       1997
cpu2: timer                    553887701       1997
Total                         2310244334       8333

netstat -m
594/2361/2955 mbufs in use (current/cache/total)
592/1854/2446/204800 mbuf clusters in use (current/cache/total/max)
592/1328 mbuf+clusters out of packet secondary zone in use (current/cache)
0/183/183/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
1332K/5030K/6362K bytes allocated to network (current/cache/total)

systat -ifstat
      Interface           Traffic               Peak                Total
        bridge0  in     38.704 MB/s         38.704 MB/s          185.924 GB
                 out    38.058 MB/s         38.058 MB/s          189.855 GB

            em1  in     13.336 MB/s         13.402 MB/s           51.475 GB
                 out    24.722 MB/s         24.722 MB/s          137.396 GB

            em0  in     24.882 MB/s         24.882 MB/s          138.918 GB
                 out    13.336 MB/s         13.403 MB/s           45.886 GB

Both FreeBSD servers have a quad-port Intel network card and 2 GB of memory:
em0@pci0:3:0:0: class=0x020000 card=0x10bc8086 chip=0x10bc8086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = '82571EB Gigabit Ethernet Controller (Copper)'
    class      = network
    subclass   = ethernet

The firewall server runs on an Intel(R) Xeon(R) X3220 @ 2.40GHz CPU (quad
core); the web server runs on an Intel(R) Xeon(R) 3070 @ 2.66GHz (dual core).

So, in brief: how can I get rid of "No buffer space available", increase
the send rate of hping on FreeBSD, and get rid of dropped packets at
rates like 24 MB/s? :)
What other tests can I run (switching off CPU cores, etc.)?
Anyone interested?

P.S. I'm using a custom kernel with SCHED_ULE; both FreeBSD systems are
built from source with CPUTYPE?=core2, and net.inet.icmp.icmplim_output=0
is set.

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Stefan Lambrev-2
Greetings,

After playing with many settings and testing various configurations, I'm
now able to receive more than 800,000 packets/s on the bridge without
errors, which is amazing!
Unfortunately, the server behind the bridge can't handle more than
250,000 packets/s.
Please advise how I can raise that limit. Is it possible?

The servers have an 82573E Gigabit Ethernet Controller (quad port).
So far I have tried lagg and ng_fec, but I see more problems than
benefits with them :)
I also tried polling, with kern.polling.user_frac from 5 to 95 and
different HZ values, but nothing helped.
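
In case it matters, polling was enabled roughly like this (a sketch;
DEVICE_POLLING is compiled into my custom kernel, and em0 is just one of
the interfaces):

  options DEVICE_POLLING             # kernel config
  ifconfig em0 polling               # switch the interface to polling mode
  sysctl kern.polling.user_frac=50   # percent of CPU reserved for userland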

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

George Neville-Neil-3
At Wed, 30 Jan 2008 19:13:07 +0200,
Stefan Lambrev wrote:

> After playing with many settings and testing various configurations, I'm
> now able to receive more than 800,000 packets/s on the bridge without
> errors. Unfortunately, the server behind the bridge can't handle more
> than 250,000 packets/s. Please advise how I can raise that limit.

Increase the size of your socket buffers.

Increase the amount of mbufs in the system.
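
For example, something like this (the values are illustrative only, size
them to your workload; nmbclusters can also be set from /boot/loader.conf):

  sysctl kern.ipc.maxsockbuf=8388608    # max socket buffer size in bytes
  sysctl kern.ipc.nmbclusters=262144    # cap on mbuf clusters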

Best,
George

Re: network performance

Stefan Lambrev-2
Greetings,

George Neville-Neil wrote:

> Increase the size of your socket buffers.
>
> Increase the amount of mbufs in the system.
Here is what I put in my sysctl.conf:

kern.random.sys.harvest.ethernet=0
kern.ipc.nmbclusters=262144
kern.ipc.maxsockbuf=2097152
kern.ipc.maxsockets=98624
kern.ipc.somaxconn=1024

and in /boot/loader.conf:
vm.kmem_size="1024M"
kern.hz="500"

This is from netstat -m:
516/774/1290 mbufs in use (current/cache/total)
513/411/924/262144 mbuf clusters in use (current/cache/total/max)
513/383 mbuf+clusters out of packet secondary zone in use (current/cache)
0/2/2/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
1155K/1023K/2178K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/0/0 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

But netstat -w1 -I em0 still shows:

            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    273877 113313   16432620     254270     0   14746500     0
    273397 109905   16403820     253946     0   14728810     0
    273945 113337   16436700     254285     0   14750560     0

What bothers me is the output of top -S:

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
   22 root        1 -68    -     0K    16K CPU1   1  12:11 100.00% em0 taskq
   11 root        1 171 ki31     0K    16K RUN    0  21:56 99.17% idle: cpu0
   10 root        1 171 ki31     0K    16K RUN    1   9:16  0.00% idle: cpu1
   14 root        1 -44    -     0K    16K WAIT   0   0:07  0.00% swi1: net

and vmstat:

 procs      memory      page                   disk   faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad4   in   sy   cs us sy id
 1 0 0   67088 1939700     0   0   0   0     0   0   0 2759  119 1325  0 50 50
 0 0 0   67088 1939700     0   0   0   0     0   0   0 2760  127 1178  0 50 50
 0 0 0   67088 1939700     0   0   0   0     0   0   0 2761  120 1269  0 50 50

What am I missing?


Re: network performance

Stefan Lambrev-2
Greetings,

In my desire to increase network throughput, and to be able to handle
more than ~250-270 kpps, I started experimenting with lagg and the Link
Aggregation Control Protocol (LACP).
To my surprise, this doesn't increase the number of packets my server can
handle.
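
For completeness, the lagg interface was set up more or less like this (a
sketch in lagg(4) syntax; em0/em2 are the member ports and the address is
from my test network):

  ifconfig lagg0 create
  ifconfig lagg0 laggproto lacp laggport em0 laggport em2
  ifconfig lagg0 inet 10.3.3.1/24 up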

Here is what netstat reports:

netstat -w1 -I lagg0
            input        (lagg0)           output
   packets  errs      bytes    packets  errs      bytes colls
    267180     0   16030806     254056     0   14735542     0
    266875     0   16012506     253829     0   14722260     0

netstat -w1 -I em0
            input          (em0)           output
   packets  errs      bytes    packets  errs      bytes colls
    124789 72976    7487340     115329     0    6690468     0
    126860 67350    7611600     114769     0    6658002     0

netstat -w1 -I em2
            input          (em2)           output
   packets  errs      bytes    packets  errs      bytes colls
    123695 65533    7421700     113575     0    6584856     0
    130277 62646    7816626     113648     0    6592280     0
    123545 64171    7412706     113714     0    6596174     0

Using lagg doesn't improve the situation at all; note that errors are not
reported on lagg0 itself, only on the member interfaces.
Using lagg also increased context switches:

 procs      memory      page                   disk   faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr ad4   in   sy   cs us sy id
 1 0 0   81048 1914640    52   0   0   0    50   0   0 3036 37902 13512  1 20 79
 0 0 0   81048 1914640    13   0   0   0     0   0   0 9582   83 22166  0 56 44
 0 0 0   81048 1914640    13   0   0   0     0   0   0 9594   80 22028  0 55 45
 0 0 0   81048 1914640    13   0   0   0     0   0   0 9593   82 22095  0 56 44

top showed 55%+ system time in the CPU states, which seems quite high.

I'll use hwpmc and LOCK_PROFILING to see where the kernel spends its time.
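
The rough recipe for both, in case anyone wants to follow along (the event
name and file paths are just examples; the kernel needs options
HWPMC_HOOKS and options LOCK_PROFILING):

  kldload hwpmc                                 # load the sampling driver
  pmcstat -S instructions -O /tmp/samples.out   # sample while the flood runs
  pmcstat -R /tmp/samples.out -g                # write gprof(1)-style profiles
  sysctl debug.lock.prof.enable=1               # start collecting lock stats
  sysctl debug.lock.prof.stats                  # dump them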

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Stefan Lambrev-2
Greetings,

Here is what hwpmc shows (without using lagg):

  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 14.7  325801.00 325801.00        0  100.00%           MD5Transform [1]
  8.4  512008.00 186207.00        0  100.00%           _mtx_unlock_flags [2]
  6.1  646787.00 134779.00        0  100.00%           _mtx_lock_flags [3]
  5.6  769909.00 123122.00        0  100.00%           uma_zalloc_arg [4]
  5.0  879853.00 109944.00        0  100.00%           rn_match [5]
  3.5  957294.00 77441.00        0  100.00%           memcpy [6]
  3.1 1025989.00 68695.00        0  100.00%           bzero [7]
  2.8 1087273.00 61284.00        0  100.00%           em_encap [8]
  2.6 1145231.00 57958.00        0  100.00%           ip_output [9]
  2.5 1200105.00 54874.00        0  100.00%           bus_dmamap_load_mbuf_sg [10]
  2.3 1251626.00 51521.00        0  100.00%           syncache_add [11]
  2.1 1297826.50 46200.50        0  100.00%           syncache_lookup [12]
  2.1 1343661.50 45835.00        0  100.00%           tcp_input [13]
  1.8 1383912.00 40250.50        0  100.00%           ip_input [14]
  1.5 1417997.00 34085.00        0  100.00%           syncache_respond [15]
  1.5 1451114.50 33117.50        0  100.00%           uma_zfree_internal [16]
  1.5 1484046.00 32931.50        0  100.00%           critical_exit [17]
  1.5 1516899.00 32853.00        0  100.00%           MD5Update [18]

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=19b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4>
        ether 00:15:17:58:11:a5
        inet 10.3.3.1 netmask 0xffffff00 broadcast 10.3.3.255
        media: Ethernet autoselect (1000baseTX <full-duplex>)
        status: active

Is it normal for so much time to be spent in MD5Transform, with rxcsum/txcsum enabled?

The LOCK_PROFILING results are here: http://89.186.204.158/lock_profiling2.txt

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Andrew Thompson-2
On Mon, Feb 04, 2008 at 05:26:35PM +0200, Stefan Lambrev wrote:

> In my desire to increase network throughput, and to be able to handle
> more than ~250-270 kpps, I started experimenting with lagg and LACP.
> To my surprise, this doesn't increase the number of packets my server
> can handle.

Thanks for investigating this. One thing to note is that IP flows from
the same connection always go down the same interface; this is because
Ethernet is not allowed to reorder frames. The hash uses src-mac,
dst-mac, src-ip and dst-ip (see lagg_hashmbuf), so make sure when
performance testing that your traffic varies in these values. Adding
tcp/udp ports to the hashing may help.
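
With an hping-style generator, for instance, an invocation roughly like
the one below varies src-ip on every packet, which should already spread
the flows across the laggports (just an illustration; <target> is whatever
host you are flooding):

  hping3 -S -p 80 --flood --rand-source <target>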

I look forward to your profiling results.


cheers,
Andrew

Re: network performance

Stefan Lambrev-2
Andrew Thompson wrote:

> Thanks for investigating this. One thing to note is that IP flows from
> the same connection always go down the same interface; this is because
> Ethernet is not allowed to reorder frames. The hash uses src-mac,
> dst-mac, src-ip and dst-ip (see lagg_hashmbuf), so make sure when
> performance testing that your traffic varies in these values.
The traffic I generate has a random/spoofed source, so it is split
between the interfaces for sure :)

Here are the results under load, from hwpmc and LOCK_PROFILING:
http://89.186.204.158/lock_profiling-lagg.txt
http://89.186.204.158/lagg-gprof.txt



Re: network performance

Stefan Lambrev-2
Stefan Lambrev wrote:

> Here are the results under load, from hwpmc and LOCK_PROFILING:
> http://89.186.204.158/lock_profiling-lagg.txt
> http://89.186.204.158/lagg-gprof.txt
I forgot this file: http://89.186.204.158/lagg2-gprof.txt :)


Re: network performance

Stefan Lambrev-2
Greetings,

Stefan Lambrev wrote:

>> Here are the results under load, from hwpmc and LOCK_PROFILING:
>> http://89.186.204.158/lock_profiling-lagg.txt
>> http://89.186.204.158/lagg-gprof.txt
>
> I forgot this file: http://89.186.204.158/lagg2-gprof.txt :)
I found that MD5Transform always uses ~14% (with rxcsum/txcsum enabled or
disabled), and when running without lagg, MD5Transform takes up to 20% of
the time.
Is this normal?

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Kris Kennaway-3
Stefan Lambrev wrote:

> Here are the results under load, from hwpmc and LOCK_PROFILING:
> http://89.186.204.158/lock_profiling-lagg.txt

OK, this shows the following major problems:

     39     22375065      1500649     5690741     3     0       119007       712359 /usr/src/sys/net/route.c:147 (sleep mutex:radix node head)
     21      3012732      1905704     1896914     1     1        14102       496427 /usr/src/sys/netinet/ip_output.c:594 (sleep mutex:rtentry)
     22          120      2073128          47     2 44109            0            3 /usr/src/sys/modules/if_lagg/../../net/ieee8023ad_lacp.c:503 (rw:if_lagg rwlock)
     39     17857439      4262576     5690740     3     0        95072      1484738 /usr/src/sys/net/route.c:197 (sleep mutex:rtentry)

It looks like the if_lagg one has already been fixed in 8.0; it could
probably be backported, but that requires some other infrastructure that
might not be in 7.0.

The others are to do with concurrent transmission of packets (it is
doing silly things with route lookups).  kmacy has a WIP that fixes
this.  If you are interested in testing an 8.0 kernel with the fixes let
me know.

> I found that MD5Transform always uses ~14% (with rxcsum/txcsum enabled
> or disabled).

Yeah, these don't have anything to do with MD5.

> And when running without lagg, MD5Transform takes up to 20% of the time.
> Is this normal?

It is probably from the syncache.  You could disable it
(net.inet.tcp.syncookies_only) if you don't need strong protection
against SYN flooding.
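
That is, roughly:

  sysctl net.inet.tcp.syncookies_only=1   # answer SYNs with cookies only,
                                          # keeping no syncache state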

Kris

Re: network performance

Stefan Lambrev-2
Hello,

Kris Kennaway wrote:

> The others are to do with concurrent transmission of packets (it is
> doing silly things with route lookups).  kmacy has a WIP that fixes
> this.  If you are interested in testing an 8.0 kernel with the fixes
> let me know.
Well, those servers are only for tests, so I can test everything; but at
some point I'll have to make a final decision about what to use in
production :)
>
>> I found that MD5Transform always uses ~14% (with rxcsum/txcsum enabled
>> or disabled).
>
> Yeah, these don't have anything to do with MD5.
Well, I didn't find where MD5Transform() is called from, so I guess it's
some 'magic' that I still do not understand ;)
>
> It is probably from the syncache.  You could disable it
> (net.inet.tcp.syncookies_only) if you don't need strong protection
> against SYN flooding.
How the server performs during SYN flooding is exactly what I'm testing
at the moment :) So I can't disable it.

Just for information, if someone is interested: I looked at how Linux
(2.6.22-14-generic, Ubuntu) performs in the same situation. By default it
doesn't perform at all - it hardly replies to 100-200 packets/s. With
syncookies enabled it can handle up to 70-90,000 pps (250-270,000
compared to FreeBSD), but the server is very loaded and not very
responsive.
Of course this doesn't mean that FreeBSD can't perform better ;)

I plan to test iptables, a newer kernel, various options, and maybe a few
other distros.

Re: network performance

Kris Kennaway-3
Stefan Lambrev wrote:

>
> Well, those servers are only for tests, so I can test everything; but at
> some point I'll have to make a final decision about what to use in
> production :)

http://www.freebsd.org/~kris/p4-net.tbz is a sys/ tarball from my p4
branch, which includes these and other optimizations.

> Well, I didn't find where MD5Transform() is called from, so I guess it's
> some 'magic' that I still do not understand ;)

MD5Transform is an internal function called by the other MD5* functions.
Check netinet/tcp_syncache.c.

> How the server performs during SYN flooding is exactly what I'm testing
> at the moment :) So I can't disable it.

I thought this trace was on the machine you are transmitting the SYNs
from, perhaps I misunderstood.

> With syncookies enabled it can handle up to 70-90,000 pps (250-270,000
> compared to FreeBSD), but the server is very loaded and not very
> responsive.
> Of course this doesn't mean that FreeBSD can't perform better ;)

What do you mean "compared to FreeBSD"?

Kris


Re: network performance

Stefan Lambrev-2
Greetings,

Kris Kennaway wrote:

> http://www.freebsd.org/~kris/p4-net.tbz is a sys/ tarball from my p4
> branch, which includes these and other optimizations.
Just downloaded it - I will patch my system and test today.

>
> MD5Transform is an internal function called by the other MD5* functions.
> Check netinet/tcp_syncache.c.
Well, now I understand why I see it only on the final destination host
and not on the firewall :)

>
> I thought this trace was on the machine you are transmitting the SYNs
> from, perhaps I misunderstood.
The first traces, from when we discussed hping, were from the machine
transmitting the SYNs. Now I'm at the next step, where I'm trying to
survive the SYN flood. That's why lagg + LACP sounds intriguing to me:
the em driver is not really SMP-capable, but if the traffic is split
between two or more network cards, I'll be able to utilize two or more
CPUs.

>
>> With syncookies enabled it can handle up to 70-90,000 pps (250-270,000
>> compared to FreeBSD), but the server is very loaded and not very
>> responsive.
>
> What do you mean "compared to FreeBSD"?
I mean that the same hardware, when running Linux, survives being
bombarded with 70-90 kpps, while when running FreeBSD it survives
250-270 kpps.
Of course, I'm using mostly default values for this Linux distro, so to
make the comparison fair I'll try to tune Linux too.

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Stefan Lambrev-2
Greetings,

Kris Kennaway wrote:
> http://www.freebsd.org/~kris/p4-net.tbz is a sys/ tarball from my p4
> branch, which includes these and other optimizations.
I have some problems compiling the new kernel:

cc -c -O2 -frename-registers -pipe -fno-strict-aliasing -march=nocona
-std=c99 -g -Wall -Wredundant-decls -Wnested-externs
-Wstrict-prototypes  -Wmissing-prototypes -Wpointer-arith -Winline
-Wcast-qual  -Wundef -Wno-pointer-sign -fformat-extensions -nostdinc  
-I. -I/usr/src/sys -I/usr/src/sys/contrib/altq -D_KERNEL
-DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -fno-omit-frame-pointer -mcmodel=kernel
-mno-red-zone  -mfpmath=387 -mno-sse -mno-sse2 -mno-mmx -mno-3dnow  
-msoft-float -fno-asynchronous-unwind-tables -ffreestanding -Werror  
/usr/src/sys/ufs/ufs/ufs_lookup.c
/usr/src/sys/ufs/ufs/ufs_lookup.c: In function 'ufs_lookup':
/usr/src/sys/ufs/ufs/ufs_lookup.c:171: error: 'td' undeclared (first use in this function)
/usr/src/sys/ufs/ufs/ufs_lookup.c:171: error: (Each undeclared identifier is reported only once
/usr/src/sys/ufs/ufs/ufs_lookup.c:171: error: for each function it appears in.)
/usr/src/sys/ufs/ufs/ufs_lookup.c:173:45: error: macro "VOP_LOCK" passed 3 arguments, but takes just 2
/usr/src/sys/ufs/ufs/ufs_lookup.c:173: error: 'VOP_LOCK' undeclared (first use in this function)
*** Error code 1


--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Kris Kennaway-3
Stefan Lambrev wrote:

> I have some problems compiling the new kernel:
>
> /usr/src/sys/ufs/ufs/ufs_lookup.c:171: error: 'td' undeclared (first use
> in this function)
> /usr/src/sys/ufs/ufs/ufs_lookup.c:173:45: error: macro "VOP_LOCK" passed
> 3 arguments, but takes just 2
> *** Error code 1

Sorry, forgot to check in a fix.  Apply this patch:

   http://www.freebsd.org/~kris/ufs_lookup.c.diff

(you'll need to specify the path by hand)
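
Something along these lines should do it (patch(1) takes the file to
patch as an operand):

  cd /usr/src
  fetch http://www.freebsd.org/~kris/ufs_lookup.c.diff
  patch sys/ufs/ufs/ufs_lookup.c < ufs_lookup.c.diff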

Kris

Re: network performance

Stefan Lambrev-2


Kris Kennaway wrote:

> Sorry, forgot to check in a fix.  Apply this patch:
>
>   http://www.freebsd.org/~kris/ufs_lookup.c.diff
>
> (you'll need to specify the path by hand)
Now this compiles, but:

cc -O2 -fno-strict-aliasing -pipe -march=nocona -Werror -D_KERNEL
-DKLD_MODULE -std=c99 -nostdinc   -DHAVE_KERNEL_OPTION_HEADERS -include
/usr/obj/usr/src/sys/CORE/opt_global.h -I. -I@ -I@/contrib/altq
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000 -fno-common -g -fno-omit-frame-pointer
-I/usr/obj/usr/src/sys/CORE -mcmodel=kernel -mno-red-zone  -mfpmath=387
-mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float
-fno-asynchronous-unwind-tables -ffreestanding -Wall -Wredundant-decls
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign
-fformat-extensions -c
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c
cc1: warnings being treated as errors
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c: In function 'lomac_thread_userret':
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c:2172: warning: implicit declaration of function 'PROC_LOCK'
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c:2172: warning: nested extern declaration of 'PROC_LOCK'
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c:2190: warning: implicit declaration of function 'PROC_UNLOCK'
/usr/src/sys/modules/mac_lomac/../../security/mac_lomac/mac_lomac.c:2190: warning: nested extern declaration of 'PROC_UNLOCK'
*** Error code 1

Stop in /usr/src/sys/modules/mac_lomac.

Where is -Werror defined - in the Makefile? :)
Is it safe to just remove it for security/ and continue with the build?

BTW, I removed "options ADAPTIVE_GIANT" as it is unknown with your
sources.

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Kris Kennaway-3
Stefan Lambrev wrote:

> Where is -Werror defined - in the Makefile? :)
> Is it safe to just remove it for security/ and continue with the build?
>
> BTW, I removed "options ADAPTIVE_GIANT" as it is unknown with your
> sources.

Yes, it is gone in 8.0.  Disable the module builds, because some of them,
like this one, probably need compile fixes.  If you need a subset of
modules, use MODULES_OVERRIDE=list (in /etc/make.conf).
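
For example, something like this, with whatever modules you actually need
(the names below are only an example):

  # /etc/make.conf
  MODULES_OVERRIDE= if_lagg pf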

Kris


Re: network performance

Stefan Lambrev-2
Greetings,

Kris Kennaway wrote:
> Yes, it is gone in 8.0.  Disable the module builds, because some of
> them, like this one, probably need compile fixes.  If you need a subset
> of modules, use MODULES_OVERRIDE=list (in /etc/make.conf).
>
Yes, the kernel builds.
I'm still playing with it, but the first results show that the new kernel
can handle 800k incoming packets/s (well, maybe more, but I don't have
enough power right now to generate more packets).
It still answers only 250-260k, though. I guess I'm hitting the limits of
syncache/syncookies?
Anyway, this netisr2 looks like a huge improvement :)

I can't build a kernel without option LOCK_PROFILING with your sources:

make -V CFILES -V SYSTEM_CFILES -V GEN_CFILES |  MKDEP_CPP="cc -E"
CC="cc" xargs mkdep -a -f .newdep -O2 -frename-registers -pipe
-fno-strict-aliasing -march=nocona -std=c99 -g -Wall -Wredundant-decls
-Wnested-externs -Wstrict-prototypes  -Wmissing-prototypes
-Wpointer-arith -Winline -Wcast-qual  -Wundef -Wno-pointer-sign
-fformat-extensions -nostdinc  -I. -I/usr/src/sys
-I/usr/src/sys/contrib/altq -I/usr/src/sys/contrib/ipfilter
-I/usr/src/sys/contrib/pf -I/usr/src/sys/dev/ath
-I/usr/src/sys/contrib/ngatm -I/usr/src/sys/dev/twa
-I/usr/src/sys/gnu/fs/xfs/FreeBSD
-I/usr/src/sys/gnu/fs/xfs/FreeBSD/support -I/usr/src/sys/gnu/fs/xfs
-D_KERNEL -DHAVE_KERNEL_OPTION_HEADERS -include opt_global.h -fno-common
-finline-limit=8000 --param inline-unit-growth=100 --param
large-function-growth=1000  -mcmodel=kernel -mno-red-zone  -mfpmath=387
-mno-sse -mno-sse2 -mno-mmx -mno-3dnow  -msoft-float
-fno-asynchronous-unwind-tables -ffreestanding
In file included from /usr/src/sys/netinet/ip_output.c:47:
/usr/src/sys/sys/rwlock.h:153:2: error: #error LOCK_DEBUG not defined, include <sys/lock.h> before <sys/rwlock.h>
mkdep: compile failed
*** Error code 1

So I added #include <sys/lock.h>, rebuilt the kernel, and tested again
without LOCK_PROFILING, but the results are the same.

I'll use hwpmc and LOCK_PROFILING again to see what's going on, and I
will try the same benchmark on a quad-core processor, as the number of
cores/CPUs matters now :)

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177


Re: network performance

Stefan Lambrev-2
Greetings,

Stefan Lambrev wrote:

> I'll use hwpmc and LOCK_PROFILING again to see what's going on, and I
> will try the same benchmark on a quad-core processor, as the number of
> cores/CPUs matters now :)
Here are the promised results: http://89.186.204.158/lock_profiling-8.txt
BTW, I got a kernel panic the first time I ran sysctl debug.lock.prof.stats.
I'm still trying to get hwpmc working with my CPUs and the new kernel.
Do you have any patches, Kris? Is it supposed to work with your sources on
my CPU? I can fetch your latest src/lib/libpmc from p4 if that will help :)

--

Best Wishes,
Stefan Lambrev
ICQ# 24134177
