FreeBSD 10.1-REL - network unaccessible after high traffic

bcs
Hi all,

I have two FreeBSD 10.1-RELEASE servers connected to each other. They
used to be connected via a cross link, but they are now connected to a
Cisco switch. When transferring huge files (50-500GB backup files) over
Gigabit (this is important!), the network randomly dies. The backup runs
every day/week; sometimes the connection is OK for months, sometimes it
dies twice a week. When the network dies I can log in to the server via
IPMI and use the console, and everything looks OK, but the server can't
send anything out on the network. Neither ifconfig em0 down/up nor a
netif restart helps. The problem never occurred when I used a 100Mbit
connection between them, but that was with a 3Com NIC (xl); the gigabit
adapter is Intel (em0). When I limit the transfer rate (rsync bandwidth
limit or ipfw pipe), the problem is much rarer.
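
The two rate-limiting approaches mentioned above (rsync bandwidth limit
and ipfw pipe) might look roughly like the sketch below. The caps of
50000 KB/s and 400Mbit/s, the pipe and rule numbers, the paths and the
host name "serverb" are illustrative placeholders, not values from the
setup described in this post. The helpers only print the commands, so
they can be reviewed before being run as root.

```shell
#!/bin/sh
# Dry run: print the throttling commands instead of executing them.
# All numbers, paths and the host name are hypothetical placeholders.

# Option 1: let rsync throttle itself (--bwlimit is given in KB/s).
rsync_limited_cmd() {
    printf 'rsync -a --bwlimit=%s %s %s\n' "$1" "$2" "$3"
}

# Option 2: shape outgoing traffic on the interface with an
# ipfw/dummynet pipe.
ipfw_pipe_cmds() {
    printf 'ipfw pipe 1 config bw %s\n' "$1"
    printf 'ipfw add 100 pipe 1 ip from any to any out via %s\n' "$2"
}

rsync_limited_cmd 50000 /backup/ serverb:/backup/
ipfw_pipe_cmds 400Mbit/s em0
```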

I tried these tuning parameters on both servers, with different buffer
sizes, but nothing helped:

# cat /etc/sysctl.conf
security.bsd.see_other_uids=0
net.inet.tcp.recvspace=512000
net.route.netisr_maxqlen=2048
kern.ipc.nmbclusters=1310720
net.inet.tcp.sendbuf_max=16777216
net.inet.tcp.recvbuf_max=16777216
kern.ipc.soacceptqueue=32768

# cat /boot/loader.conf
geom_mirror_load="YES" # RAID1 disk driver (see gmirror(8))
ipfw_load="YES"
net.inet.ip.fw.default_to_accept=1
kern.maxusers=4096
accf_data_load="YES"

Any ideas? Thanks guys!
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[hidden email]"

Re: FreeBSD 10.1-REL - network unaccessible after high traffic

Julian Elischer-5
On 5/24/15 11:12 PM, Cs wrote:

> Hi all,
>
> I have two FreeBSD 10.1-RELEASE servers connected to each other.
> They used to be connected via a cross link, but they are now
> connected to a Cisco switch. When transferring huge files (50-500GB
> backup files) over Gigabit (this is important!), the network randomly
> dies. The backup runs every day/week; sometimes the connection is OK
> for months, sometimes it dies twice a week. When the network dies I
> can log in to the server via IPMI and use the console, and everything
> looks OK, but the server can't send anything out on the network.
> Neither ifconfig em0 down/up nor a netif restart helps. The problem
> never occurred when I used a 100Mbit connection between them, but
> that was with a 3Com NIC (xl); the gigabit adapter is Intel (em0).
> When I limit the transfer rate (rsync bandwidth limit or ipfw pipe),
> the problem is much rarer.

Did you have the problem with no switch?
Is the duplex setting correct?

Re: FreeBSD 10.1-REL - network unaccessible after high traffic

bcs
Hi Julian,

Yes, the problem was the same when I used the cross link for two years.
The duplex settings are identical on both servers.

Server A:
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
         ether 00:25:90:24:52:66
         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active

Server B:
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
         options=4219b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,TSO4,WOL_MAGIC,VLAN_HWTSO>
         ether 00:30:48:dd:fe:3e
         inet x.x.x.x netmask 0xfffffe00 broadcast x.x.x.x
         nd6 options=29<PERFORMNUD,IFDISABLED,AUTO_LINKLOCAL>
         media: Ethernet autoselect (1000baseT <full-duplex>)
         status: active

I always suspected the 'em' driver and thought "it will be fixed in the
next release", but after ~3 years I think I need to dig deep to find the
root cause.

Regards,
Csaba
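
One way to start digging for the root cause is to capture the state of
the network stack from the IPMI console while the interface is hung. The
sketch below is only a starting point under stated assumptions: the
choice of commands (mbuf usage, per-interface error and drop counters,
interrupt counts, em driver statistics) is a suggestion, and the unit
number passed to the function is a placeholder (0 for em0, 1 for em1).

```shell
#!/bin/sh
# Print a labelled snapshot of network state. Run it on the console while
# the interface is hung and redirect the output to a file for comparison
# with a snapshot taken when the network is healthy.
snapshot() {
    unit="${1:-0}"   # em unit number; an assumption, adjust per server
    for cmd in "netstat -m" "netstat -idn" "vmstat -i" "sysctl dev.em.$unit"; do
        echo "== $cmd"
        # $cmd is deliberately unquoted so it splits into command and arguments.
        $cmd 2>&1 || echo "(failed or not available on this system)"
    done
}

snapshot 0
```

Comparing the mbuf figures from netstat -m against the configured
kern.ipc.nmbclusters value may show whether buffer exhaustion is
involved.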

On 2015.05.25 at 4:30, Julian Elischer wrote:

> Did you have the problem with no switch?
> Is the duplex setting correct?

Re: FreeBSD 10.1-REL - network unaccessible after high traffic

bcs
Hi All,

Julian gave me the idea of increasing the MTU to 9000, but I'm not sure
whether it will help. Anyway, I increased it on both servers.
@Julian: are the loader.conf and sysctl tuning parameters OK?

Regards,
Csaba
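
The MTU change mentioned above could be applied and checked along the
lines of the sketch below. The interface name and the peer host name
"serverb" are placeholders, and the script only prints the commands
rather than running them. The ping payload is the MTU minus the 20-byte
IPv4 header and the 8-byte ICMP header.

```shell
#!/bin/sh
# Dry run: print the commands for a jumbo-frame test instead of running them.
# "em0" and the peer name "serverb" are hypothetical placeholders.
mtu=9000
# Largest ICMP payload that fits in one frame at this MTU.
payload=$((mtu - 20 - 8))

echo "ifconfig em0 mtu $mtu"
# -D sets the Don't Fragment bit, so the ping fails if any hop
# (including the switch) cannot pass a full-size frame.
echo "ping -D -s $payload -c 3 serverb"
```

With a switch in the path, jumbo frames also have to be enabled on the
switch ports, otherwise the full-size ping is expected to fail.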

On 2015.05.25 at 8:34, Cs wrote:

> Hi Julian,
>
> Yes, the problem was the same when I used the cross link for two years.
> The duplex settings are identical on both servers.

Re: FreeBSD 10.1-REL - network unaccessible after high traffic

Julian Elischer-5
On 5/25/15 2:54 PM, Cs wrote:
> Hi All,
>
> Julian gave me the idea of increasing the MTU to 9000, but I'm not
> sure whether it will help. Anyway, I increased it on both servers.
> @Julian: are the loader.conf and sysctl tuning parameters OK?

A better mailing list might be freebsd-net.

I don't see any obvious problems.
It may be worth trying to turn off all the advanced features like
TSO etc.
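
As a rough illustration of that suggestion, the sketch below prints an
ifconfig command that clears the offload capabilities listed in the
options= line posted earlier in the thread (TSO4, VLAN_HWTSO, RXCSUM,
TXCSUM). Which features to disable is an assumption here, as is the
interface name passed in. The helper is a dry run that prints the
command for review instead of executing it.

```shell
#!/bin/sh
# Dry run: print the ifconfig command instead of executing it.
# Run the printed line as root on the affected server.
offload_off_cmd() {
    iface="$1"   # e.g. em0 or em1; adjust per server
    # -tso, -vlanhwtso, -rxcsum and -txcsum clear the TSO4, VLAN_HWTSO,
    # RXCSUM and TXCSUM capabilities shown by ifconfig.
    printf 'ifconfig %s -tso -vlanhwtso -rxcsum -txcsum\n' "$iface"
}

offload_off_cmd em0
```

Disabling one feature at a time, with a test transfer in between, would
help narrow down which one (if any) is involved.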


Re: FreeBSD 10.1-REL - network unaccessible after high traffic

bcs
Thanks, will try that!

On 2015.05.25 at 9:25, Julian Elischer wrote:

> A better mailing list might be freebsd-net.
>
> I don't see any obvious problems.
> It may be worth trying to turn off all the advanced features like
> TSO etc.