PCBGROUP + RSS: problem establishing an outgoing TCP connection

Andriy Gapon

This happens on an EC2 instance with the ixv driver.
When I try to establish an outgoing TCP connection I see the following exchange:
the local side sends SYN, receives SYN+ACK and immediately sends RST.
I tracked this down to in_pcblookup_mbuf() failing to find the corresponding inpcb.

I dug a bit deeper and this is my understanding of the issue.

When tcp_connect() calls in_pcbrehash() the inpcb gets placed into a group
determined by in_pcbgroup_bytuple() [see in_pcbgroup_update and
in_pcbgroup_byinpcb].  The inpcb does not have INP_RSS_BUCKET_SET.  Both
addresses and ports are populated at that time.

When the reply packet is received, in_pcblookup_mbuf() uses in_pcbgroup_byhash()
to look up the group because the packet has M_HASHTYPE_RSS_TCP_IPV4.
The problem is that in_pcbgroup_byhash() returns a different group and the inpcb
cannot be found.
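
To make the mismatch above concrete, here is a minimal user-space sketch, not
the kernel code.  It assumes that both the stack and the NIC compute a Toeplitz
hash over the TCP/IPv4 4-tuple and that the group is picked as the hash modulo
the number of groups.  toeplitz_hash() and group_of() are hypothetical helper
names, and the sample key is the well-known Microsoft RSS verification key, not
anything read from a real system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample 40-byte key (Microsoft RSS verification key); illustrative only. */
static const uint8_t sample_key[40] = {
	0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
	0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
	0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
	0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
	0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/*
 * Toeplitz hash: for every set bit of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at that bit.
 */
static uint32_t
toeplitz_hash(const uint8_t *key, size_t keylen,
    const uint8_t *data, size_t datalen)
{
	uint32_t hash = 0;
	uint32_t window;
	size_t i;
	int b;

	assert(keylen >= 4);
	window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
	    ((uint32_t)key[2] << 8) | (uint32_t)key[3];

	for (i = 0; i < datalen; i++) {
		for (b = 0; b < 8; b++) {
			if (data[i] & (0x80 >> b))
				hash ^= window;
			/* Slide the window one bit further along the key. */
			window <<= 1;
			if (i + 4 < keylen && (key[i + 4] & (0x80 >> b)))
				window |= 1;
		}
	}
	return (hash);
}

/* Assumed group selection: hash modulo the number of groups. */
static unsigned
group_of(uint32_t hash, unsigned ngroups)
{
	return (hash % ngroups);
}
```

If the insert path hashes the tuple with one key and the receive path trusts a
hash that the hardware computed with another key, group_of() will generally
return different indices for the same connection, which matches the failed
lookup described above.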

I am very new to this code, so I would appreciate any help with further
debugging and root causing the problem.

Thank you!

--
Andriy Gapon
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-net
To unsubscribe, send any mail to "[hidden email]"

ixv + PCBGROUP + RSS: problem establishing an outgoing TCP connection

Andriy Gapon
On 10/09/2019 12:14, Andriy Gapon wrote:
>
> This happens on an EC2 instance with ixv driver.

I wonder if anyone ever tested ixv with PCBGROUP...
I see a trivial but severe bug.
if_ixv.c does not include opt_rss.h.  Because of this, IXGBE_FEATURE_RSS gets
defined to zero (in ixgbe_features.h).  So, instead of using rss_getkey() to
get the RSS key, the driver just generates a random one.
No surprise then that the hardware (VF) produces totally different hashes.
But maybe that's not all.
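
As a rough illustration of why a missing option header has this effect, here
is a small stand-alone sketch of the preprocessor pattern.  It is not the
driver source: SKETCH_FEATURE_RSS and uses_system_rss_key() are hypothetical
names, and the RSS macro is deliberately left undefined to mimic a file that
never includes opt_rss.h.

```c
#include <stdint.h>

/*
 * In a kernel build the RSS macro would come from the generated opt_rss.h.
 * It is left undefined here on purpose.
 */
/* #define RSS 1 */

#ifdef RSS
#define SKETCH_FEATURE_RSS	(1u << 0)
#else
#define SKETCH_FEATURE_RSS	0u	/* the feature bit silently vanishes */
#endif

/* Returns nonzero if the "fetch the configured RSS key" branch would run. */
static int
uses_system_rss_key(uint32_t feat_en)
{
	return ((feat_en & SKETCH_FEATURE_RSS) != 0);
}
```

With the macro undefined, the test is false for every value of feat_en, so
code written this way always falls through to the random-key branch no matter
how the kernel itself was configured.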

On top of that, I wonder why the driver enables RSS in the hardware when feat_en
does not have IXGBE_FEATURE_RSS.
Could anyone please explain the logic behind that?
Please see ixv_initialize_rss_mapping.
For example:
        if (adapter->feat_en & IXGBE_FEATURE_RSS) {
                /* Fetch the configured RSS key */
                rss_getkey((uint8_t *)&rss_key);
        } else {
                /* set up random bits */
                arc4rand(&rss_key, sizeof(rss_key), 0);
        }
And so on.

Additionally, I found this bit of information:
The limitation for VF RSS on Intel® 82599 10 Gigabit Ethernet Controller is: The
hash and key are shared among PF and all VF, the RETA table with 128 entries is
also shared among PF and all VF; So it could not to provide a method to query
the hash and reta content per VF on guest, while, if possible, please query them
on host for the shared RETA information.

And my "hardware" is exactly an 82599 VF.
I hacked the driver to not call ixv_initialize_rss_mapping() at all, but even
with that change the packet descriptors had IXGBE_RXDADV_RSSTYPE_IPV4_TCP in
pkt_info.  Maybe it's because of how the PF was configured.
So, I wonder if ixgbe_isc_rxd_pkt_get() should be modified to not set iri_flowid
and iri_rsstype under some conditions.
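
One possible shape for such a condition, as a sketch only: keep the hardware
value as a flow identifier, but label it as an opaque hash unless the driver
really programmed the stack's RSS key.  All names below (sketch_rxd_info,
sketch_fill_hash() and the enum values) are hypothetical and do not come from
the driver or from iflib.

```c
#include <stdint.h>

/* Hypothetical hash-type labels reported to the network stack. */
enum sketch_hashtype {
	SKETCH_HASH_NONE = 0,
	SKETCH_HASH_OPAQUE,		/* well distributed, but not the stack's RSS hash */
	SKETCH_HASH_RSS_TCP_IPV4
};

/* Hypothetical subset of the per-packet receive information. */
struct sketch_rxd_info {
	uint32_t		flowid;
	enum sketch_hashtype	rsstype;
};

/*
 * Only claim a real RSS hash type when RSS was configured with the
 * stack's key; otherwise downgrade the label to "opaque".
 */
static void
sketch_fill_hash(struct sketch_rxd_info *ri, int rss_enabled,
    uint32_t hw_hash, enum sketch_hashtype hw_type)
{
	ri->flowid = hw_hash;
	ri->rsstype = rss_enabled ? hw_type : SKETCH_HASH_OPAQUE;
}
```

The idea is that a lookup path keyed on the hash type would then stop trusting
the hardware value and fall back to hashing the tuple itself.  Whether "opaque"
or "none" is the better label here is an open design choice.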


--
Andriy Gapon

Re: ixv + PCBGROUP + RSS: problem establishing an outgoing TCP connection

Anish Gupta
> if_ixv.c does not include opt_rss.h.  Because of this IXGBE_FEATURE_RSS gets
> defined to zero (in ixgbe_features.h).  So, instead of using rss_getkey() to
> get the RSS key, the driver just generates a random one.

I have not looked at ixv, but as far as I understand our build process,
opt_rss.h needs to be included.

-Anish


Re: ixv + PCBGROUP + RSS: problem establishing an outgoing TCP connection

Andriy Gapon

Here is a change that I would like to propose based on the earlier observations
and findings: https://reviews.freebsd.org/D21705

On 11/09/2019 09:15, Andriy Gapon wrote:

> On 10/09/2019 12:14, Andriy Gapon wrote:
>>
>> This happens on an EC2 instance with ixv driver.
>
> I wonder if anyone ever tested ixv with PCBGROUP...
> I see a trivial but severe bug.
> if_ixv.c does not include opt_rss.h.  Because of this IXGBE_FEATURE_RSS gets
> defined to zero (in ixgbe_features.h).  So, instead of using rss_getkey() to
> get the RSS key, the driver just generates a random one.
> No surprise then that the hardware (VF) produces totally different hashes.
> But maybe that's not all.

--
Andriy Gapon