xen+vimage kernel panic

Nathan Friess
Hi,

While testing out the new PVH support in a domU (which is running
great!), I discovered a kernel panic related to xen and vimage support
when trying to add an xn interface into a bridge.

I'm running r337024 from svn.  Removing vimage (which seems to be turned
on in 12-CURRENT now) allows using the bridge with no panics.  As part
of attempting to debug this I enabled vimage in my 11.2 domU and that
also panics in the same code.

I'm not sure if the problem is a xen issue or a vimage issue so I
haven't submitted a PR yet.  The kernel output is listed below.

It looks like netfront_backend_changed() calls netfront_send_fake_arp(),
which calls arp_ifinit() on the interface.  The first line of the call
stack with arprequest+0x454 corresponds to a call to
ARPSTAT_INC(txrequests) at the end of arprequest, which expands to
VNET_PCPUSTAT_ADD().  I tried to debug further and I got a little lost,
but that's where I figured out that vimage is involved somehow.

Are there any thoughts on why the xn interface would cause a panic there?

Thanks,

Nathan




=======

Steps to reproduce:

# ifconfig bridge create
bridge0
# ifconfig bridge0 addm xn0
(panic...)


======

Kernel output:

xn0: performing interface reset due to feature change
(... lock reversal)
xn0: backend features: feature-sg feature-gso-tcp4


Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 02
fault virtual address = 0x28
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80d15db4
stack pointer        = 0x0:0xfffffe0000483840
frame pointer        = 0x0:0xfffffe0000483940
code segment = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 14 (xenwatch)
[ thread pid 14 tid 100033 ]
Stopped at      arprequest+0x454:       movq    ll+0x7(%rax),%rax

db> bt
Tracing pid 14 tid 100033 td 0xfffff800032f5000
arprequest() at arprequest+0x454/frame 0xfffffe0000483940
arp_ifinit() at arp_ifinit+0x58/frame 0xfffffe0000483980
netfront_backend_changed() at netfront_backend_changed+0x144/frame 0xfffffe0000483a40
xenwatch_thread() at xenwatch_thread+0x182/frame 0xfffffe0000483a70
fork_exit() at fork_exit+0x84/frame 0xfffffe0000483ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfffffe0000483ab0

======

_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-xen
To unsubscribe, send any mail to "[hidden email]"

Re: xen+vimage kernel panic

Marko Zec-2
On Sun, 19 Aug 2018 12:50:55 -0600
Nathan Friess <[hidden email]> wrote:

> Are there any thoughts on why the xn interface would cause a panic
> there?
The xn driver calls arp_ifinit() without setting the vnet context
first.  Perhaps the attached patch could help (not even compile
tested...)

Marko
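
The attached xn_vnet.diff is not reproduced in this archive. As a rough illustration of the failure mode described above, here is a small user-space C model of a "current vnet" pointer that a statistics update depends on. Every name prefixed with model_ is hypothetical and exists only for this sketch. In the kernel the corresponding macros are CURVNET_SET() and CURVNET_RESTORE() from sys/net/vnet.h. This is a sketch under those assumptions, not the actual patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one network stack instance (a "vnet"). */
struct model_vnet {
	unsigned long arp_txrequests;	/* per-instance ARP request counter */
};

/* Hypothetical stand-in for an interface, which belongs to one vnet. */
struct model_ifnet {
	struct model_vnet *if_vnet;
};

/*
 * Models the "current vnet" context. NULL means no context has been set.
 * The kernel tracks this per thread; one global is enough for this
 * single-threaded model.
 */
static struct model_vnet *model_curvnet = NULL;

/*
 * Models the statistics update at the end of the ARP request path.
 * The kernel dereferences the current vnet unconditionally; this model
 * returns -1 instead of faulting so the failure can be observed.
 */
int
model_arp_request(void)
{
	if (model_curvnet == NULL)
		return (-1);
	model_curvnet->arp_txrequests++;
	return (0);
}

/* Models the unpatched driver path: no context is set before the call. */
int
model_send_arp_unpatched(struct model_ifnet *ifp)
{
	(void)ifp;	/* the interface's vnet is never consulted */
	return (model_arp_request());
}

/* Models the suggested fix: set the interface's vnet, call, restore. */
int
model_send_arp_patched(struct model_ifnet *ifp)
{
	struct model_vnet *saved;
	int error;

	assert(ifp->if_vnet != NULL);
	saved = model_curvnet;		/* plays the role of CURVNET_SET() */
	model_curvnet = ifp->if_vnet;
	error = model_arp_request();
	model_curvnet = saved;		/* plays the role of CURVNET_RESTORE() */
	return (error);
}
```

In this model, model_send_arp_unpatched() returns -1 and leaves the counter untouched, while model_send_arp_patched() returns 0, increments the counter once, and leaves the current context as it found it. The fault address of 0x28 in the panic is a small offset from zero, which is at least consistent with a NULL context pointer being dereferenced at a field offset.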



Attachment: xn_vnet.diff (494 bytes)

Re: xen+vimage kernel panic

Roger Pau Monné
On Sun, Aug 19, 2018 at 10:48:52PM +0200, Marko Zec wrote:

> > Are there any thoughts on why the xn interface would cause a panic
> > there?
>
> The xn driver calls arp_ifinit() without setting the vnet context
> first.  Perhaps the attached patch could help (not even compile
> tested...)

I know nothing about VNET, so is this initialization required now that
VNET is enabled? Is this an existing bug in netfront that was harmless
before VNET was activated?

Can you please file a bug report and attach the patch?

Thanks, Roger.

Re: xen+vimage kernel panic

Nathan Friess
On 2018-08-20 03:49 AM, Roger Pau Monné wrote:

>>> Are there any thoughts on why the xn interface would cause a panic
>>> there?
>>
>> The xn driver calls arp_ifinit() without setting the vnet context
>> first.  Perhaps the attached patch could help (not even compile
>> tested...)
>
> I know nothing about VNET, so is this initialization required now that
> VNET is enabled? Is this an existing bug in netfront that was harmless
> before VNET was activated?
>
> Can you please file a bug report and attach the patch?

Hi Roger and everyone,

My apologies for not opening the bug report earlier this week.  I tested
the patch this weekend and it indeed does fix the panic in my domUs.

Cheers,

Nathan