interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Ben Woods
Morning!

Since my recent update from FreeBSD 12-current r313908 to r315466, I have
noticed some strange behaviour with one of my network interfaces.

The interface seems to work fine for a day or so, but on a number of
occasions I have found it to be down, and constantly outputting the
following message to the console every few seconds:
igb3: TX(7) desc avail = 41, pidx = 308
igb3: TX(7) desc avail = 41, pidx = 308
igb3: TX(7) desc avail = 41, pidx = 308
...

The problem is quickly worked around by issuing the following commands:
# service netif stop igb3
# service netif start igb3

Details of this particular network interface card:
$ pciconf -lv | grep igb3 -A4
igb3@pci0:0:20:1: class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet

Any ideas what this could be, or how to investigate further?
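
One way to start investigating is to tally how often each interface and
queue reports the message. The following is only a sketch: the function
name summarize_tx_stalls is made up for illustration, and it assumes the
messages are still in the kernel message buffer (or in a log file you feed
it) in exactly the format shown above.

```sh
# Hypothetical helper: read console/log text on stdin and print one line
# per interface and TX queue in the form "<ifname> <queue> <count>".
summarize_tx_stalls() {
    awk '
        /: TX\([0-9]+\) desc avail = [0-9]+, pidx = [0-9]+/ {
            ifname = $1; sub(/:$/, "", ifname)      # "igb3:" -> "igb3"
            queue = $2;  gsub(/[^0-9]/, "", queue)  # "TX(7)" -> "7"
            count[ifname " " queue]++
        }
        END { for (k in count) print k, count[k] }
    ' | sort
}

# Example use: dmesg | summarize_tx_stalls
```

If only one queue ever shows up, that narrows the problem to a single TX
ring rather than the interface as a whole.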

Regards,
Ben

--
From: Benjamin Woods
[hidden email]
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[hidden email]"

Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Kevin Bowling
Try turning TSO off, i.e. ifconfig igb3 -tso or sysctl net.inet.tcp.tso=0

The transition to iflib has exposed much jankiness in the Intel "shared
code" of the e1000 drivers.  In particular, the locking contracts may not
align with FreeBSD locking primitives.  I have boxes running the legacy
driver that are clearly reliant on the watchdog reset for steady state,
which is unacceptable.  We are actively looking into this at LLNW, but
additional reports and help are appreciated.

Regards,

On Fri, Mar 24, 2017 at 6:33 PM, Ben Woods <[hidden email]> wrote:

> [...]

Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Ben Woods
On 27 March 2017 at 15:35, Kevin Bowling <[hidden email]> wrote:

> Try turning TSO off.. i.e. ifconfig igb3 -tso or sysctl net.inet.tcp.tso=0
>
> The transition to iflib has exposed much jankiness in the Intel "shared
> code" of the e1000 drivers.  In particular, the locking contracts may not
> align with FreeBSD locking primitives.  I have boxes running the legacy
> driver that are clearly reliant on the watchdog reset for steady state
> which is unacceptable.  We are actively looking into this at LLNW, but
> additional reports and help are appreciated.
>


Hi Kevin,

Thanks for the reply. Sorry it took so long to get back to you; I first
had to wait for the problem to recur.

Indeed, running "ifconfig igb3 -tso" did fix the issue. It didn't seem to
the first time, but I may have been too quick to judge. After re-enabling
TSO and disabling it a second time, the problem stopped, and the interface
immediately came up.

Please let me know if there is anything I can do to try and help diagnose
this. In the meantime, I have added net.inet.tcp.tso=0 to my
/etc/sysctl.conf, so I don't think this will recur unless I remove it.
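
For anyone who would rather not disable TSO for every interface, a
per-interface setting in /etc/rc.conf is a possible alternative. This is a
sketch only: the value "up -tso" is an assumption for an interface with no
address of its own, so substitute whatever arguments your ifconfig_igb3
line already carries and append -tso.

```sh
# /etc/rc.conf fragment (sh syntax).
# Assumption: igb3 needs no address or other options here; adjust to taste.
ifconfig_igb3="up -tso"

# To check at runtime, run "ifconfig igb3" and look at the options= line;
# TSO4 and TSO6 should no longer be listed once -tso has taken effect.
# "sysctl net.inet.tcp.tso" prints the current global setting.
```

The global sysctl is the blunter tool but covers all interfaces; the
rc.conf form leaves TSO enabled on interfaces that are not misbehaving.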

Regards,
Ben


Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Kevin Bowling
Sean Bruno committed a couple of fixes to the watchdog code this week that
should at least allow for usable TSO, although the frequency of the
watchdog events is still cause for concern.  It seems some timeouts are
part of Intel's expectations during normal operation for several chipsets.

If you could share which exact NIC chipset you have I will check the
datasheets and see if we're missing anything.

Regards,

On Sat, Apr 1, 2017 at 6:41 PM, Ben Woods <[hidden email]> wrote:

> [...]

Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Ben Woods
On 2 April 2017 at 16:04, Kevin Bowling <[hidden email]> wrote:

> Sean Bruno committed a couple fixes to the watchdog code this week that
> should at least allow for a usable TSO although the frequency of the
> watchdog events is still cause for concern.  It seems some timeouts are
> part of Intel's expectations during normal operations for several chipsets.
>
> If you could share which exact NIC chipset you have I will check the
> datasheets and see if we're missing anything.
>
>

Hi Kevin,

Thanks for the reply. More details about my chipset below.

$ dmesg | grep igb3
igb3: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3020-0x303f mem 0xdfec0000-0xdfedffff,0xdff28000-0xdff2bfff irq 19 at device 20.1 on pci0
igb3: attach_pre capping queues at 8
igb3: using 1024 tx descriptors and 1024 rx descriptors
igb3: msix_init qsets capped at 8
igb3: pxm cpus: 8 queue msgs: 9 admincnt: 1
igb3: using 8 rx queues 8 tx queues
igb3: Using MSIX interrupts with 9 vectors
igb3: allocated for 8 tx_queues
igb3: allocated for 8 rx_queues
igb3: Ethernet address: 00:08:a2:09:3c:75
igb3: netmap queues/slots: TX 8/1024, RX 8/1024
igb3: promiscuous mode enabled
igb3: link state changed to UP


$ pciconf -lvv | grep igb3 -A4
igb3@pci0:0:20:1:       class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
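
For looking the part up in datasheets or a PCI ID database, note that the
chip= field in pciconf output packs the device ID in the upper 16 bits and
the vendor ID in the lower 16 bits. A small sketch of splitting it follows;
the function name split_chip is made up for illustration, and it assumes
the field has the 8-hex-digit form shown above.

```sh
# Hypothetical helper: split a pciconf "chip=0xDDDDVVVV" field into its
# device (DDDD) and vendor (VVVV) halves.
split_chip() {
    hex=${1#chip=0x}       # "chip=0x1f418086" -> "1f418086"
    device=${hex%????}     # drop the last four digits  -> "1f41"
    vendor=${hex#????}     # drop the first four digits -> "8086"
    printf 'vendor=0x%s device=0x%s\n' "$vendor" "$device"
}

split_chip chip=0x1f418086   # prints: vendor=0x8086 device=0x1f41
```

Here 0x8086 is Intel's vendor ID and 0x1f41 is the device ID reported for
the I354.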

Regards,
Ben


Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Sean Bruno


On 03/24/17 19:33, Ben Woods wrote:

> [...]

Ben:

What kind of workload is this machine processing?  I'd like to try and
duplicate this failure if possible.

sean



Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Ben Woods
>
> Ben:
>
> What kind of workload is this machine processing?  I'd like to try and
> duplicate this failure if possible.
>
> sean
>
>
Hi Sean,

It is a Netgate RCC-VE-8860 running as my home firewall.
https://netgate.com/docs/rcc-ve-8860/quick-start-guide.html

I am running FreeBSD 12-current r315466, using pf, dnsmasq, powerd,
hostapd, openntpd, nfs_client, vnstat, salt_minion, ng_netflow, and ppp as
a pppoe client.
The latter two run only on the WAN interface, igb1.
I have igb0, igb2, igb3, igb4 and wlan0 bridged together, with a static ip
as my LAN gateway assigned to bridge0.

I am seeing the error on igb3, which is the interface connected to my
desktop PC (dual booted Windows and FreeBSD).

$ dmesg -a | grep "Intel(R)"
CPU: Intel(R) Atom(TM) CPU  C2758  @ 2.40GHz (2400.06-MHz K8-class CPU)
igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x1000-0x101f mem 0xdfc00000-0xdfc1ffff,0xdfc20000-0xdfc23fff irq 18 at device 0.0 on pci3
igb1: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x2000-0x201f mem 0xdfd00000-0xdfd1ffff,0xdfd20000-0xdfd23fff irq 19 at device 0.0 on pci4
igb2: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3000-0x301f mem 0xdfea0000-0xdfebffff,0xdff24000-0xdff27fff irq 18 at device 20.0 on pci0
igb3: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3020-0x303f mem 0xdfec0000-0xdfedffff,0xdff28000-0xdff2bfff irq 19 at device 20.1 on pci0
igb4: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3040-0x305f mem 0xdfee0000-0xdfefffff,0xdff2c000-0xdff2ffff irq 20 at device 20.2 on pci0
igb5: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0x3060-0x307f mem 0xdff00000-0xdff1ffff,0xdff30000-0xdff33fff irq 21 at device 20.3 on pci0


$ pciconf -lv | grep -i net -B2
igb2@pci0:0:20:0:    class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
igb3@pci0:0:20:1:    class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
igb4@pci0:0:20:2:    class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
igb5@pci0:0:20:3:    class=0x020000 card=0x1f418086 chip=0x1f418086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Ethernet Connection I354'
    class      = network
    subclass   = ethernet
--
ath0@pci0:1:0:0:    class=0x028000 card=0x3099168c chip=0x002a168c rev=0x01 hdr=0x00
    vendor     = 'Qualcomm Atheros'
    device     = 'AR928X Wireless Network Adapter (PCI-Express)'
    class      = network
igb0@pci0:3:0:0:    class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I211 Gigabit Network Connection'
    class      = network
    subclass   = ethernet
igb1@pci0:4:0:0:    class=0x020000 card=0x00008086 chip=0x15398086 rev=0x03 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'I211 Gigabit Network Connection'
    class      = network
    subclass   = ethernet


Regards,
Ben


Re: interface down, console output: igb3: TX(7) desc avail = 41, pidx = 308

Sean Bruno


On 04/11/17 09:39, Ben Woods wrote:

>     Ben:
>
>     What kind of workload is this machine processing?  I'd like to try and
>     duplicate this failure if possible.
>
>     sean
>
>
> Hi Sean,
>
> It is a Netgate RCC-VE-8860 running as my home firewall.
> https://netgate.com/docs/rcc-ve-8860/quick-start-guide.html
Ok, so *a lot* of packet forwarding?

sean
