Commit 367705+367706 causes a panic

Commit 367705+367706 causes a panic

Peter Blok
Hi,

I’m afraid the last epoch fix for the bridge does not solve the problem (or perhaps it creates a new one).

This seems to happen when the jail epair is added to the bridge.

Removing both fixes solves the problem.

Peter


kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address = 0xc10
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80695e76
stack pointer        = 0x28:0xfffffe00bf14e6e0
frame pointer        = 0x28:0xfffffe00bf14e720
code segment = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 1686 (jail)
trap number = 12
panic: page fault
cpuid = 6
time = 1605811310
KDB: stack backtrace:
#0 0xffffffff8069bb85 at kdb_backtrace+0x65
#1 0xffffffff80650a4b at vpanic+0x17b
#2 0xffffffff806508c3 at panic+0x43
#3 0xffffffff809d0351 at trap_fatal+0x391
#4 0xffffffff809d03af at trap_pfault+0x4f
#5 0xffffffff809cf9f6 at trap+0x286
#6 0xffffffff809a98c8 at calltrap+0x8
#7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
#9 0xffffffff80757d40 at vnet_if_init+0x120
#10 0xffffffff8078c994 at vnet_alloc+0x114
#11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
#12 0xffffffff80620190 at sys_jail_set+0x40
#13 0xffffffff809d0f07 at amd64_syscall+0x387
#14 0xffffffff809aa1ee at fast_syscall_common+0xf8


Re: Commit 367705+367706 causes a panic

Kristof Provost
On 20 Nov 2020, at 11:18, [hidden email] wrote:
> I’m afraid the last Epoch fix for bridge is not solving the problem
> ( or perhaps creates a new ).
>
We’re talking about the stable/12 branch, right?

> This seems to happen when the jail epair is added to the bridge.
>
There must be something more to it than that. I’ve run the bridge
tests on stable/12 without issue, and this is a problem we didn’t see
when the bridge epochification initially went into stable/12.

Do you have a custom kernel config? Other patches? What exact commands
do you run to trigger the panic?

> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 6; apic id = 06
> fault virtual address = 0xc10
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80695e76
> stack pointer        = 0x28:0xfffffe00bf14e6e0
> frame pointer        = 0x28:0xfffffe00bf14e720
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 1686 (jail)
> trap number = 12
> panic: page fault
> cpuid = 6
> time = 1605811310
> KDB: stack backtrace:
> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
> #1 0xffffffff80650a4b at vpanic+0x17b
> #2 0xffffffff806508c3 at panic+0x43
> #3 0xffffffff809d0351 at trap_fatal+0x391
> #4 0xffffffff809d03af at trap_pfault+0x4f
> #5 0xffffffff809cf9f6 at trap+0x286
> #6 0xffffffff809a98c8 at calltrap+0x8
> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
> #9 0xffffffff80757d40 at vnet_if_init+0x120
> #10 0xffffffff8078c994 at vnet_alloc+0x114
> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
> #12 0xffffffff80620190 at sys_jail_set+0x40
> #13 0xffffffff809d0f07 at amd64_syscall+0x387
> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8

This panic is rather odd. It isn’t even in the bridge code; it happens
during the initial creation of the vnet. I don’t really see how this
could even trigger a panic.
The panic looks as if something corrupted net_epoch_preempt by
overwriting epoch->e_epoch. The bridge patches only access this
variable through the well-established functions and macros. I see no
obvious way that they could corrupt it.
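
The observation that the trace never enters the bridge code can be checked mechanically. The following is only a sketch: the frames are copied from the report above, `frame_symbols` is a hypothetical helper, and matching on the substring "bridge" is an assumption about how the if_bridge functions are named.

```python
import re

# Frames #7 to #12 copied from the panic report above.
BACKTRACE = """\
#7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
#9 0xffffffff80757d40 at vnet_if_init+0x120
#10 0xffffffff8078c994 at vnet_alloc+0x114
#11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
#12 0xffffffff80620190 at sys_jail_set+0x40
"""

# One KDB backtrace line: "#N 0xADDR at symbol+0xOFFSET".
FRAME_RE = re.compile(r"^#(\d+)\s+0x[0-9a-f]+\s+at\s+(\w+)\+0x[0-9a-f]+$")


def frame_symbols(text):
    """Return the function names of the backtrace, innermost frame first."""
    symbols = []
    for line in text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match:
            symbols.append(match.group(2))
    return symbols


symbols = frame_symbols(BACKTRACE)
# Assumption: if_bridge functions carry "bridge" in their symbol name.
bridge_frames = [name for name in symbols if "bridge" in name]

print(" <- ".join(symbols))
print("bridge frames:", bridge_frames)
```

With these frames the call chain runs from sys_jail_set down to ck_epoch_synchronize_wait, and the list of bridge frames is empty.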

Best regards,
Kristof
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: Commit 367705+367706 causes a panic

Peter Blok
Hi Kristof,

This is 12-stable. With the previous bridge epochification, which was backed out, my config had a panic too.

I don’t have any local modifications. I did a clean rebuild after removing /usr/obj/usr.

My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and nmdm.ko as modules. Everything else is statically linked. I have removed all drivers not needed for the hardware at hand.

My bridge connects two VLANs from the same trunk, the jail epair devices, and the bhyve tap devices.

The panic happens when the jails are starting.

I can try to narrow it down over the weekend and make the crash dump available for analysis.

Previously I had the following crash with 363492:

kernel trap 12 with interrupts disabled


Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 02
fault virtual address = 0xffffffff00000410
fault code = supervisor read data, page not present
instruction pointer = 0x20:0xffffffff80692326
stack pointer        = 0x28:0xfffffe00c06097b0
frame pointer        = 0x28:0xfffffe00c06097f0
code segment = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags = resume, IOPL = 0
current process = 2030 (ifconfig)
trap number = 12
panic: page fault
cpuid = 2
time = 1595683412
KDB: stack backtrace:
#0 0xffffffff80698165 at kdb_backtrace+0x65
#1 0xffffffff8064d67b at vpanic+0x17b
#2 0xffffffff8064d4f3 at panic+0x43
#3 0xffffffff809cc311 at trap_fatal+0x391
#4 0xffffffff809cc36f at trap_pfault+0x4f
#5 0xffffffff809cb9b6 at trap+0x286
#6 0xffffffff809a5b28 at calltrap+0x8
#7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff8069213a at epoch_wait_preempt+0xaa
#9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
#10 0xffffffff8075274f at ifioctl+0x47f
#11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
#12 0xffffffff806b5b4a at sys_ioctl+0xfa
#13 0xffffffff809ccec7 at amd64_syscall+0x387
#14 0xffffffff809a6450 at fast_syscall_common+0x101




> On 20 Nov 2020, at 11:30, Kristof Provost <[hidden email]> wrote:
>
> On 20 Nov 2020, at 11:18, [hidden email] <mailto:[hidden email]> wrote:
>> I’m afraid the last Epoch fix for bridge is not solving the problem ( or perhaps creates a new ).
>>
> We’re talking about the stable/12 branch, right?
>
>> This seems to happen when the jail epair is added to the bridge.
>>
> There must be something more to it than that. I’ve run the bridge tests on stable/12 without issue, and this is a problem we didn’t see when the bridge epochification initially went into stable/12.
>
> Do you have a custom kernel config? Other patches? What exact commands do you run to trigger the panic?
>
>> kernel trap 12 with interrupts disabled
>>
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 6; apic id = 06
>> fault virtual address = 0xc10
>> fault code = supervisor read data, page not present
>> instruction pointer = 0x20:0xffffffff80695e76
>> stack pointer        = 0x28:0xfffffe00bf14e6e0
>> frame pointer        = 0x28:0xfffffe00bf14e720
>> code segment = base 0x0, limit 0xfffff, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 1686 (jail)
>> trap number = 12
>> panic: page fault
>> cpuid = 6
>> time = 1605811310
>> KDB: stack backtrace:
>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>> #1 0xffffffff80650a4b at vpanic+0x17b
>> #2 0xffffffff806508c3 at panic+0x43
>> #3 0xffffffff809d0351 at trap_fatal+0x391
>> #4 0xffffffff809d03af at trap_pfault+0x4f
>> #5 0xffffffff809cf9f6 at trap+0x286
>> #6 0xffffffff809a98c8 at calltrap+0x8
>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>> #12 0xffffffff80620190 at sys_jail_set+0x40
>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>
> This panic is rather odd. This isn’t even the bridge code. This is during initial creation of the vnet. I don’t really see how this could even trigger panics.
> That panic looks as if something corrupted the net_epoch_preempt, by overwriting the epoch->e_epoch. The bridge patches only access this variable through the well-established functions and macros. I see no obvious way that they could corrupt it.
>
> Best regards,
> Kristof



Re: Commit 367705+367706 causes a panic

Kristof Provost
Can you share your kernel config file (and src.conf / make.conf if they
exist)?

This second panic is in the IPsec code. My current thinking is that your
kernel config is triggering a bug that’s manifesting in multiple
places, but not actually caused by those places.
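
To illustrate what "manifesting in multiple places" means for the two reports, here is a small sketch that compares the innermost frames of both backtraces. The frame lines are copied from the two panics in this thread; the helper names `callers` and `shared_innermost` are made up for this example.

```python
# Frames #7 to #10 of the jail panic (stable/12 with 367705+367706).
JAIL_PANIC = """\
#7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
#9 0xffffffff80757d40 at vnet_if_init+0x120
#10 0xffffffff8078c994 at vnet_alloc+0x114
"""

# Frames #7 to #10 of the earlier ifconfig panic (with 363492).
IFCONFIG_PANIC = """\
#7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
#8 0xffffffff8069213a at epoch_wait_preempt+0xaa
#9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
#10 0xffffffff8075274f at ifioctl+0x47f
"""


def callers(trace):
    """Extract the symbol name from each '#N 0xADDR at symbol+0xOFF' line."""
    return [
        line.split(" at ", 1)[1].split("+", 1)[0]
        for line in trace.splitlines()
        if " at " in line
    ]


def shared_innermost(first, second):
    """Return the leading frames that both traces have in common."""
    shared = []
    for a, b in zip(first, second):
        if a != b:
            break
        shared.append(a)
    return shared


jail = callers(JAIL_PANIC)
ifconfig = callers(IFCONFIG_PANIC)
shared = shared_innermost(jail, ifconfig)

print("shared innermost frames:", shared)
print("first differing callers:", jail[len(shared)], "vs", ifconfig[len(shared)])
```

Both traces fault in ck_epoch_synchronize_wait called from epoch_wait_preempt; only the caller above that differs (vnet_if_init versus ipsec_ioctl).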

I’d like to be able to reproduce it so we can debug it.

Best regards,
Kristof

On 20 Nov 2020, at 12:02, Peter Blok wrote:

> Hi Kristof,
>
> This is 12-stable. With the previous bridge epochification that was
> backed out my config had a panic too.
>
> I don’t have any local modifications. I did a clean rebuild after
> removing /usr/obj/usr
>
> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and
> nmdm.ko as modules. Everything else is statically linked. I have
> removed all drivers not needed for the hardware at hand.
>
> My bridge is between two vlans from the same trunk and the jail epair
> devices as well as the bhyve tap devices.
>
> The panic happens when the jails are starting.
>
> I can try to narrow it down over the weekend and make the crash dump
> available for analysis.
>
> Previously I had the following crash with 363492
>
> kernel trap 12 with interrupts disabled
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 2; apic id = 02
> fault virtual address = 0xffffffff00000410
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff80692326
> stack pointer        = 0x28:0xfffffe00c06097b0
> frame pointer        = 0x28:0xfffffe00c06097f0
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = resume, IOPL = 0
> current process = 2030 (ifconfig)
> trap number = 12
> panic: page fault
> cpuid = 2
> time = 1595683412
> KDB: stack backtrace:
> #0 0xffffffff80698165 at kdb_backtrace+0x65
> #1 0xffffffff8064d67b at vpanic+0x17b
> #2 0xffffffff8064d4f3 at panic+0x43
> #3 0xffffffff809cc311 at trap_fatal+0x391
> #4 0xffffffff809cc36f at trap_pfault+0x4f
> #5 0xffffffff809cb9b6 at trap+0x286
> #6 0xffffffff809a5b28 at calltrap+0x8
> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
> #10 0xffffffff8075274f at ifioctl+0x47f
> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
> #13 0xffffffff809ccec7 at amd64_syscall+0x387
> #14 0xffffffff809a6450 at fast_syscall_common+0x101



Re: Commit 367705+367706 causes a panic

Peter Blok
The panic with IPsec code in the backtrace was already very strange. I was using IPsec, but only on one interface that was completely separate from the bridge and its members. The jails were not doing any IPsec either. Note that this panic happened a while ago, after the first bridge epochification went into stable/12, which was later backed out.

Today the system no longer uses IPsec, but it is still compiled in. I can remove it for a test if need be.
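
For reference, the `time =` values printed in the two panic messages can be turned into dates. This is a sketch that assumes the value is seconds since the Unix epoch in UTC; the dictionary labels are only descriptive.

```python
from datetime import datetime, timezone

# "time = ..." values copied from the two panic messages in this thread.
PANIC_TIMES = {
    "ifconfig / ipsec_ioctl panic (with 363492)": 1595683412,
    "jail / vnet_if_init panic (with 367705+367706)": 1605811310,
}


def to_utc(seconds):
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


for label, seconds in PANIC_TIMES.items():
    print(f"{label}: {to_utc(seconds).isoformat()}")

first, second = (to_utc(value) for value in PANIC_TIMES.values())
days_apart = (second - first).days
print("days between the two panics:", days_apart)
```

Under that assumption, the IPsec panic dates from 25 July 2020 and the jail panic from 19 November 2020, about 117 days apart.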


src.conf
WITHOUT_KERBEROS=yes
WITHOUT_GSSAPI=yes
WITHOUT_SENDMAIL=true
WITHOUT_MAILWRAPPER=true
WITHOUT_DMAGENT=true
WITHOUT_GAMES=true
WITHOUT_IPFILTER=true
WITHOUT_UNBOUND=true
WITHOUT_PROFILE=true
WITHOUT_ATM=true
WITHOUT_BSNMP=true
#WITHOUT_CROSS_COMPILER=true
WITHOUT_DEBUG_FILES=true
WITHOUT_DICT=true
WITHOUT_FLOPPY=true
WITHOUT_HTML=true
WITHOUT_HYPERV=true
WITHOUT_NDIS=true
WITHOUT_NIS=true
WITHOUT_PPP=true
WITHOUT_TALK=true
WITHOUT_TESTS=true
WITHOUT_WIRELESS=true
#WITHOUT_LIB32=true
WITHOUT_LPR=true

make.conf
KERNCONF=BHYVE
MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
OPTIONS_UNSET=DOCS NLS MANPAGES

BHYVE
cpu HAMMER
ident BHYVE

makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support

options CAMDEBUG

options SCHED_ULE # ULE scheduler
options PREEMPTION # Enable kernel thread preemption
options INET # InterNETworking
options INET6 # IPv6 communications protocols
options IPSEC
options TCP_OFFLOAD # TCP offload
options TCP_RFC7413 # TCP FASTOPEN
options SCTP # Stream Control Transmission Protocol
options FFS # Berkeley Fast Filesystem
options SOFTUPDATES # Enable FFS soft updates support
options UFS_ACL # Support for access control lists
options UFS_DIRHASH # Improve performance on big directories
options UFS_GJOURNAL # Enable gjournal-based UFS journaling
options QUOTA # Enable disk quotas for UFS
options SUIDDIR
options NFSCL # Network Filesystem Client
options NFSD # Network Filesystem Server
options NFSLOCKD # Network Lock Manager
options MSDOSFS # MSDOS Filesystem
options CD9660 # ISO 9660 Filesystem
options FUSEFS
options NULLFS # NULL filesystem
options UNIONFS
options FDESCFS # File descriptor filesystem
options PROCFS # Process filesystem (requires PSEUDOFS)
options PSEUDOFS # Pseudo-filesystem framework
options GEOM_PART_GPT # GUID Partition Tables.
options GEOM_RAID # Soft RAID functionality.
options GEOM_LABEL # Provides labelization
options GEOM_ELI # Disk encryption.
options COMPAT_FREEBSD32 # Compatible with i386 binaries
options COMPAT_FREEBSD4 # Compatible with FreeBSD4
options COMPAT_FREEBSD5 # Compatible with FreeBSD5
options COMPAT_FREEBSD6 # Compatible with FreeBSD6
options COMPAT_FREEBSD7 # Compatible with FreeBSD7
options COMPAT_FREEBSD9 # Compatible with FreeBSD9
options COMPAT_FREEBSD10 # Compatible with FreeBSD10
options COMPAT_FREEBSD11 # Compatible with FreeBSD11
options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
options KTRACE # ktrace(1) support
options STACK # stack(9) support
options SYSVSHM # SYSV-style shared memory
options SYSVMSG # SYSV-style message queues
options SYSVSEM # SYSV-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
options AUDIT # Security event auditing
options CAPABILITY_MODE # Capsicum capability mode
options CAPABILITIES # Capsicum capabilities
options MAC # TrustedBSD MAC Framework
options MAC_PORTACL
options MAC_NTPD
options KDTRACE_FRAME # Ensure frames are compiled in
options KDTRACE_HOOKS # Kernel DTrace hooks
options DDB_CTF # Kernel ELF linker loads CTF data
options INCLUDE_CONFIG_FILE # Include this file in kernel

# Debugging support.  Always need this:
options KDB # Enable kernel debugger support.
options KDB_TRACE # Print a stack trace for a panic.
options KDB_UNATTENDED

# Make an SMP-capable kernel by default
options SMP # Symmetric MultiProcessor Kernel
options EARLY_AP_STARTUP

# CPU frequency control
device cpufreq
device cpuctl
device coretemp

# Bus support.
device acpi
options ACPI_DMAR
device pci
options PCI_IOV # PCI SR-IOV support

device iicbus
device iicbb

device iic
device ic
device iicsmb

device ichsmb
device smbus
device smb

#device jedec_dimm

# ATA controllers
device ahci # AHCI-compatible SATA controllers
device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA

# SCSI Controllers
device mps # LSI-Logic MPT-Fusion 2

# ATA/SCSI peripherals
device scbus # SCSI bus (required for ATA/SCSI)
device da # Direct Access (disks)
device cd # CD
device pass # Passthrough device (direct ATA/SCSI access)
device ses # Enclosure Services (SES and SAF-TE)
device sg

device cfiscsi
device ctl # CAM Target Layer
device iscsi

# atkbdc0 controls both the keyboard and the PS/2 mouse
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
device psm # PS/2 mouse

device kbdmux # keyboard multiplexer

# vt is the new video console driver
device vt
device vt_vga
device vt_efifb

# Serial (COM) ports
device uart # Generic UART driver

# PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
device iflib
device em # Intel PRO/1000 Gigabit Ethernet Family
device ix # Intel PRO/10GbE PCIE PF Ethernet

# Network stack virtualization.
options VIMAGE

# Pseudo devices.
device crypto
device cryptodev
device loop # Network loopback
device random # Entropy device
device padlock_rng # VIA Padlock RNG
device rdrand_rng # Intel Bull Mountain RNG
device ipmi
device smbios
device vpd
device aesni # AES-NI OpenCrypto module
device ether # Ethernet support
device lagg
device vlan # 802.1Q VLAN support
device tuntap # Packet tunnel.
device md # Memory "disks"
device gif # IPv6 and IPv4 tunneling
device firmware # firmware assist module

device pf
#device pflog
#device pfsync

# The `bpf' device enables the Berkeley Packet Filter.
# Be aware of the administrative consequences of enabling this!
# Note that 'bpf' is required for DHCP.
device bpf # Berkeley packet filter

# The `epair' device implements a virtual back-to-back connected Ethernet
# like interface pair.
device epair

# USB support
options USB_DEBUG # enable debug msgs
device uhci # UHCI PCI->USB interface
device ohci # OHCI PCI->USB interface
device ehci # EHCI PCI->USB interface (USB 2.0)
device xhci # XHCI PCI->USB interface (USB 3.0)
device usb # USB Bus (required)
device uhid
device ukbd # Keyboard
device umass # Disks/Mass storage - Requires scbus and da
device ums

device filemon

device if_bridge
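
When comparing this config against another one, it may help to reduce it to sets of options and devices. The sketch below uses a short excerpt of the BHYVE config and the MODULES_OVERRIDE line from above; `parse_kernconf` is a hypothetical helper, not part of config(8), and the cross-check only reports names that appear both as a static device and as a module. Whether that overlap matters for the panic is not established in this thread.

```python
# Excerpt of the BHYVE kernel config posted above.
KERNCONF_EXCERPT = """\
options IPSEC
options VIMAGE
options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
device pf
#device pflog
device epair
device if_bridge
"""

# Excerpt of the MODULES_OVERRIDE value from the make.conf posted above.
MODULES_OVERRIDE = "opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog"


def parse_kernconf(text):
    """Collect 'options' and 'device' names, ignoring comments and values."""
    entries = {"options": set(), "device": set()}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing and full-line comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] in entries:
            entries[parts[0]].add(parts[1].split("=", 1)[0])
    return entries


static = parse_kernconf(KERNCONF_EXCERPT)
modules = set(MODULES_OVERRIDE.split())
overlap = static["device"] & modules

print("IPSEC compiled in:", "IPSEC" in static["options"])
print("VIMAGE compiled in:", "VIMAGE" in static["options"])
print("listed as both device and module:", sorted(overlap))
```

For this excerpt the only name listed both ways is if_bridge; pflog is commented out in the kernel config and therefore appears only as a module.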

> On 20 Nov 2020, at 12:53, Kristof Provost <[hidden email]> wrote:
>
> Can you share your kernel config file (and src.conf / make.conf if they exist)?
>
> This second panic is in the IPSec code. My current thinking is that your kernel config is triggering a bug that’s manifesting in multiple places, but not actually caused by those places.
>
> I’d like to be able to reproduce it so we can debug it.
>
> Best regards,
> Kristof



Re: Commit 367705+367706 causes a panic

Kristof Provost
I still can’t reproduce that panic.

Does it happen immediately after you start a vnet jail?

Does it also happen with a GENERIC kernel?

Regards,
Kristof

On 20 Nov 2020, at 14:53, Peter Blok wrote:

> The panic with ipsec code in the backtrace was already very strange. I
> was using IPsec, but only on one interface totally separate from the
> members of the bridge as well as the bridge itself. The jails were not
> doing any ipsec as well. Note that panic was a while ago and it was
> after the 1st bridge epochification was done on stable-12 which was
> later backed out
>
> Today the system is no longer using ipsec, but it is still compiled
> in. I can remove it if need be for a test
>
>
> src.conf
> WITHOUT_KERBEROS=yes
> WITHOUT_GSSAPI=yes
> WITHOUT_SENDMAIL=true
> WITHOUT_MAILWRAPPER=true
> WITHOUT_DMAGENT=true
> WITHOUT_GAMES=true
> WITHOUT_IPFILTER=true
> WITHOUT_UNBOUND=true
> WITHOUT_PROFILE=true
> WITHOUT_ATM=true
> WITHOUT_BSNMP=true
> #WITHOUT_CROSS_COMPILER=true
> WITHOUT_DEBUG_FILES=true
> WITHOUT_DICT=true
> WITHOUT_FLOPPY=true
> WITHOUT_HTML=true
> WITHOUT_HYPERV=true
> WITHOUT_NDIS=true
> WITHOUT_NIS=true
> WITHOUT_PPP=true
> WITHOUT_TALK=true
> WITHOUT_TESTS=true
> WITHOUT_WIRELESS=true
> #WITHOUT_LIB32=true
> WITHOUT_LPR=true
>
> make.conf
> KERNCONF=BHYVE
> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp
> if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common
> linuxkpi linprocfs linsysfs ext2fs
> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
> OPTIONS_UNSET=DOCS NLS MANPAGES
>
> BHYVE
> cpu HAMMER
> ident BHYVE
>
> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
> makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>
> options CAMDEBUG
>
> options SCHED_ULE # ULE scheduler
> options PREEMPTION # Enable kernel thread preemption
> options INET # InterNETworking
> options INET6 # IPv6 communications protocols
> options IPSEC
> options TCP_OFFLOAD # TCP offload
> options TCP_RFC7413 # TCP FASTOPEN
> options SCTP # Stream Control Transmission Protocol
> options FFS # Berkeley Fast Filesystem
> options SOFTUPDATES # Enable FFS soft updates support
> options UFS_ACL # Support for access control lists
> options UFS_DIRHASH # Improve performance on big directories
> options UFS_GJOURNAL # Enable gjournal-based UFS journaling
> options QUOTA # Enable disk quotas for UFS
> options SUIDDIR
> options NFSCL # Network Filesystem Client
> options NFSD # Network Filesystem Server
> options NFSLOCKD # Network Lock Manager
> options MSDOSFS # MSDOS Filesystem
> options CD9660 # ISO 9660 Filesystem
> options FUSEFS
> options NULLFS # NULL filesystem
> options UNIONFS
> options FDESCFS # File descriptor filesystem
> options PROCFS # Process filesystem (requires PSEUDOFS)
> options PSEUDOFS # Pseudo-filesystem framework
> options GEOM_PART_GPT # GUID Partition Tables.
> options GEOM_RAID # Soft RAID functionality.
> options GEOM_LABEL # Provides labelization
> options GEOM_ELI # Disk encryption.
> options COMPAT_FREEBSD32 # Compatible with i386 binaries
> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
> options COMPAT_FREEBSD9 # Compatible with FreeBSD9
> options COMPAT_FREEBSD10 # Compatible with FreeBSD10
> options COMPAT_FREEBSD11 # Compatible with FreeBSD11
> options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
> options KTRACE # ktrace(1) support
> options STACK # stack(9) support
> options SYSVSHM # SYSV-style shared memory
> options SYSVMSG # SYSV-style message queues
> options SYSVSEM # SYSV-style semaphores
> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time
> extensions
> options PRINTF_BUFR_SIZE=128 # Prevent printf output being
> interspersed.
> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
> options AUDIT # Security event auditing
> options CAPABILITY_MODE # Capsicum capability mode
> options CAPABILITIES # Capsicum capabilities
> options MAC # TrustedBSD MAC Framework
> options MAC_PORTACL
> options MAC_NTPD
> options KDTRACE_FRAME # Ensure frames are compiled in
> options KDTRACE_HOOKS # Kernel DTrace hooks
> options DDB_CTF # Kernel ELF linker loads CTF data
> options INCLUDE_CONFIG_FILE # Include this file in kernel
>
> # Debugging support.  Always need this:
> options KDB # Enable kernel debugger support.
> options KDB_TRACE # Print a stack trace for a panic.
> options KDB_UNATTENDED
>
> # Make an SMP-capable kernel by default
> options SMP # Symmetric MultiProcessor Kernel
> options EARLY_AP_STARTUP
>
> # CPU frequency control
> device cpufreq
> device cpuctl
> device coretemp
>
> # Bus support.
> device acpi
> options ACPI_DMAR
> device pci
> options PCI_IOV # PCI SR-IOV support
>
> device iicbus
> device iicbb
>
> device iic
> device ic
> device iicsmb
>
> device ichsmb
> device smbus
> device smb
>
> #device jedec_dimm
>
> # ATA controllers
> device ahci # AHCI-compatible SATA controllers
> device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>
> # SCSI Controllers
> device mps # LSI-Logic MPT-Fusion 2
>
> # ATA/SCSI peripherals
> device scbus # SCSI bus (required for ATA/SCSI)
> device da # Direct Access (disks)
> device cd # CD
> device pass # Passthrough device (direct ATA/SCSI access)
> device ses # Enclosure Services (SES and SAF-TE)
> device sg
>
> device cfiscsi
> device ctl # CAM Target Layer
> device iscsi
>
> # atkbdc0 controls both the keyboard and the PS/2 mouse
> device atkbdc # AT keyboard controller
> device atkbd # AT keyboard
> device psm # PS/2 mouse
>
> device kbdmux # keyboard multiplexer
>
> # vt is the new video console driver
> device vt
> device vt_vga
> device vt_efifb
>
> # Serial (COM) ports
> device uart # Generic UART driver
>
> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
> device iflib
> device em # Intel PRO/1000 Gigabit Ethernet Family
> device ix # Intel PRO/10GbE PCIE PF Ethernet
>
> # Network stack virtualization.
> options VIMAGE
>
> # Pseudo devices.
> device crypto
> device cryptodev
> device loop # Network loopback
> device random # Entropy device
> device padlock_rng # VIA Padlock RNG
> device rdrand_rng # Intel Bull Mountain RNG
> device ipmi
> device smbios
> device vpd
> device aesni # AES-NI OpenCrypto module
> device ether # Ethernet support
> device lagg
> device vlan # 802.1Q VLAN support
> device tuntap # Packet tunnel.
> device md # Memory "disks"
> device gif # IPv6 and IPv4 tunneling
> device firmware # firmware assist module
>
> device pf
> #device pflog
> #device pfsync
>
> # The `bpf' device enables the Berkeley Packet Filter.
> # Be aware of the administrative consequences of enabling this!
> # Note that 'bpf' is required for DHCP.
> device bpf # Berkeley packet filter
>
> # The `epair' device implements a virtual back-to-back connected Ethernet
> # like interface pair.
> device epair
>
> # USB support
> options USB_DEBUG # enable debug msgs
> device uhci # UHCI PCI->USB interface
> device ohci # OHCI PCI->USB interface
> device ehci # EHCI PCI->USB interface (USB 2.0)
> device xhci # XHCI PCI->USB interface (USB 3.0)
> device usb # USB Bus (required)
> device uhid
> device ukbd # Keyboard
> device umass # Disks/Mass storage - Requires scbus and da
> device ums
>
> device filemon
>
> device if_bridge
>

Re: Commit 367705+367706 causes a panic

Peter Blok
Kristof,

With a GENERIC kernel it does NOT happen. I do have a different, iflib-related panic at reboot, but I’ll report that separately.

I brought the two config files closer together and found that if I remove if_bridge from the kernel config file and let it load dynamically when the bridge is created, the panic no longer occurs and everything works.
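
For anyone who wants to try the same workaround, here is a minimal sketch of the change. It assumes the custom kernel config is the BHYVE file quoted in this thread and that if_bridge and bridgestp are still built as modules (they are listed in the MODULES_OVERRIDE line of the posted make.conf). The loader.conf line is an optional assumption for loading the module at boot rather than when the first bridge is created; it was not part of the test described above.

```
# Kernel config (BHYVE): keep if_bridge out of the static kernel.
#device if_bridge

# make.conf: the module must still be built; the posted
# MODULES_OVERRIDE already contains "if_bridge bridgestp".

# /boot/loader.conf (optional, assumption): load the module at boot
# instead of on first use.
if_bridge_load="YES"
```

Without the loader.conf line, the module can also be loaded by hand with "kldload if_bridge" before the bridge is created.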

Peter

> On 20 Nov 2020, at 15:53, Kristof Provost <[hidden email]> wrote:
>
> I still can’t reproduce that panic.
>
> Does it happen immediately after you start a vnet jail?
>
> Does it also happen with a GENERIC kernel?
>
> Regards,
> Kristof
>



Re: Commit 367705+367706 causes a panic

Peter Blok
Kristof,

With commits 367705+367706 and if_bridge statically linked, it crashes while adding the epair of a jail.

With commits 367705+367706 and if_bridge dynamically loaded, there is a crash at reboot instead:

#0 0xffffffff8069ddc5 at kdb_backtrace+0x65
#1 0xffffffff80652c8b at vpanic+0x17b
#2 0xffffffff80652b03 at panic+0x43
#3 0xffffffff809c8951 at trap_fatal+0x391
#4 0xffffffff809c89af at trap_pfault+0x4f
#5 0xffffffff809c7ff6 at trap+0x286
#6 0xffffffff809a1ec8 at calltrap+0x8
#7 0xffffffff8079f7ed at ip_input+0x63d
#8 0xffffffff8077a07a at netisr_dispatch_src+0xca
#9 0xffffffff8075a6f8 at ether_demux+0x138
#10 0xffffffff8075b9bb at ether_nh_input+0x33b
#11 0xffffffff8077a07a at netisr_dispatch_src+0xca
#12 0xffffffff8075ab1b at ether_input+0x4b
#13 0xffffffff8077a80b at swi_net+0x12b
#14 0xffffffff8061e10c at ithread_loop+0x23c
#15 0xffffffff8061afbe at fork_exit+0x7e
#16 0xffffffff809a2efe at fork_trampoline+0xe

Peter

> On 21 Nov 2020, at 17:22, Peter Blok <[hidden email]> wrote:
>
> Kristof,
>
> With a GENERIC kernel it does NOT happen. I do have a different iflib related panic at reboot, but I’ll report that separately.
>
> I brought the two config files closer together and found out that if I remove if_bridge from the config file and have it loaded dynamically when the bridge is created, the problem no longer happens and everything works ok.
>
> Peter
>
>> On 20 Nov 2020, at 15:53, Kristof Provost <[hidden email]> wrote:
>>
>> I still can’t reproduce that panic.
>>
>> Does it happen immediately after you start a vnet jail?
>>
>> Does it also happen with a GENERIC kernel?
>>
>> Regards,
>> Kristof
>>
>> On 20 Nov 2020, at 14:53, Peter Blok wrote:
>>
>>> The panic with ipsec code in the backtrace was already very strange. I was using IPsec, but only on one interface totally separate from the members of the bridge as well as the bridge itself. The jails were not doing any ipsec as well. Note that panic was a while ago and it was after the 1st bridge epochification was done on stable-12 which was later backed out
>>>
>>> Today the system is no longer using ipsec, but it is still compiled in. I can remove it if need be for a test
>>>
>>>
>>> src.conf
>>> WITHOUT_KERBEROS=yes
>>> WITHOUT_GSSAPI=yes
>>> WITHOUT_SENDMAIL=true
>>> WITHOUT_MAILWRAPPER=true
>>> WITHOUT_DMAGENT=true
>>> WITHOUT_GAMES=true
>>> WITHOUT_IPFILTER=true
>>> WITHOUT_UNBOUND=true
>>> WITHOUT_PROFILE=true
>>> WITHOUT_ATM=true
>>> WITHOUT_BSNMP=true
>>> #WITHOUT_CROSS_COMPILER=true
>>> WITHOUT_DEBUG_FILES=true
>>> WITHOUT_DICT=true
>>> WITHOUT_FLOPPY=true
>>> WITHOUT_HTML=true
>>> WITHOUT_HYPERV=true
>>> WITHOUT_NDIS=true
>>> WITHOUT_NIS=true
>>> WITHOUT_PPP=true
>>> WITHOUT_TALK=true
>>> WITHOUT_TESTS=true
>>> WITHOUT_WIRELESS=true
>>> #WITHOUT_LIB32=true
>>> WITHOUT_LPR=true
>>>
>>> make.conf
>>> KERNCONF=BHYVE
>>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>>> OPTIONS_UNSET=DOCS NLS MANPAGES
>>>
>>> BHYVE
>>> cpu HAMMER
>>> ident BHYVE
>>>
>>> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>>> makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>>>
>>> options CAMDEBUG
>>>
>>> options SCHED_ULE # ULE scheduler
>>> options PREEMPTION # Enable kernel thread preemption
>>> options INET # InterNETworking
>>> options INET6 # IPv6 communications protocols
>>> options IPSEC
>>> options TCP_OFFLOAD # TCP offload
>>> options TCP_RFC7413 # TCP FASTOPEN
>>> options SCTP # Stream Control Transmission Protocol
>>> options FFS # Berkeley Fast Filesystem
>>> options SOFTUPDATES # Enable FFS soft updates support
>>> options UFS_ACL # Support for access control lists
>>> options UFS_DIRHASH # Improve performance on big directories
>>> options UFS_GJOURNAL # Enable gjournal-based UFS journaling
>>> options QUOTA # Enable disk quotas for UFS
>>> options SUIDDIR
>>> options NFSCL # Network Filesystem Client
>>> options NFSD # Network Filesystem Server
>>> options NFSLOCKD # Network Lock Manager
>>> options MSDOSFS # MSDOS Filesystem
>>> options CD9660 # ISO 9660 Filesystem
>>> options FUSEFS
>>> options NULLFS # NULL filesystem
>>> options UNIONFS
>>> options FDESCFS # File descriptor filesystem
>>> options PROCFS # Process filesystem (requires PSEUDOFS)
>>> options PSEUDOFS # Pseudo-filesystem framework
>>> options GEOM_PART_GPT # GUID Partition Tables.
>>> options GEOM_RAID # Soft RAID functionality.
>>> options GEOM_LABEL # Provides labelization
>>> options GEOM_ELI # Disk encryption.
>>> options COMPAT_FREEBSD32 # Compatible with i386 binaries
>>> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
>>> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
>>> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
>>> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
>>> options COMPAT_FREEBSD9 # Compatible with FreeBSD9
>>> options COMPAT_FREEBSD10 # Compatible with FreeBSD10
>>> options COMPAT_FREEBSD11 # Compatible with FreeBSD11
>>> options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>>> options KTRACE # ktrace(1) support
>>> options STACK # stack(9) support
>>> options SYSVSHM # SYSV-style shared memory
>>> options SYSVMSG # SYSV-style message queues
>>> options SYSVSEM # SYSV-style semaphores
>>> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>>> options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
>>> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
>>> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
>>> options AUDIT # Security event auditing
>>> options CAPABILITY_MODE # Capsicum capability mode
>>> options CAPABILITIES # Capsicum capabilities
>>> options MAC # TrustedBSD MAC Framework
>>> options MAC_PORTACL
>>> options MAC_NTPD
>>> options KDTRACE_FRAME # Ensure frames are compiled in
>>> options KDTRACE_HOOKS # Kernel DTrace hooks
>>> options DDB_CTF # Kernel ELF linker loads CTF data
>>> options INCLUDE_CONFIG_FILE # Include this file in kernel
>>>
>>> # Debugging support.  Always need this:
>>> options KDB # Enable kernel debugger support.
>>> options KDB_TRACE # Print a stack trace for a panic.
>>> options KDB_UNATTENDED
>>>
>>> # Make an SMP-capable kernel by default
>>> options SMP # Symmetric MultiProcessor Kernel
>>> options EARLY_AP_STARTUP
>>>
>>> # CPU frequency control
>>> device cpufreq
>>> device cpuctl
>>> device coretemp
>>>
>>> # Bus support.
>>> device acpi
>>> options ACPI_DMAR
>>> device pci
>>> options PCI_IOV # PCI SR-IOV support
>>>
>>> device iicbus
>>> device iicbb
>>>
>>> device iic
>>> device ic
>>> device iicsmb
>>>
>>> device ichsmb
>>> device smbus
>>> device smb
>>>
>>> #device jedec_dimm
>>>
>>> # ATA controllers
>>> device ahci # AHCI-compatible SATA controllers
>>> device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>>>
>>> # SCSI Controllers
>>> device mps # LSI-Logic MPT-Fusion 2
>>>
>>> # ATA/SCSI peripherals
>>> device scbus # SCSI bus (required for ATA/SCSI)
>>> device da # Direct Access (disks)
>>> device cd # CD
>>> device pass # Passthrough device (direct ATA/SCSI access)
>>> device ses # Enclosure Services (SES and SAF-TE)
>>> device sg
>>>
>>> device cfiscsi
>>> device ctl # CAM Target Layer
>>> device iscsi
>>>
>>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>>> device atkbdc # AT keyboard controller
>>> device atkbd # AT keyboard
>>> device psm # PS/2 mouse
>>>
>>> device kbdmux # keyboard multiplexer
>>>
>>> # vt is the new video console driver
>>> device vt
>>> device vt_vga
>>> device vt_efifb
>>>
>>> # Serial (COM) ports
>>> device uart # Generic UART driver
>>>
>>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>>> device iflib
>>> device em # Intel PRO/1000 Gigabit Ethernet Family
>>> device ix # Intel PRO/10GbE PCIE PF Ethernet
>>>
>>> # Network stack virtualization.
>>> options VIMAGE
>>>
>>> # Pseudo devices.
>>> device crypto
>>> device cryptodev
>>> device loop # Network loopback
>>> device random # Entropy device
>>> device padlock_rng # VIA Padlock RNG
>>> device rdrand_rng # Intel Bull Mountain RNG
>>> device ipmi
>>> device smbios
>>> device vpd
>>> device aesni # AES-NI OpenCrypto module
>>> device ether # Ethernet support
>>> device lagg
>>> device vlan # 802.1Q VLAN support
>>> device tuntap # Packet tunnel.
>>> device md # Memory "disks"
>>> device gif # IPv6 and IPv4 tunneling
>>> device firmware # firmware assist module
>>>
>>> device pf
>>> #device pflog
>>> #device pfsync
>>>
>>> # The `bpf' device enables the Berkeley Packet Filter.
>>> # Be aware of the administrative consequences of enabling this!
>>> # Note that 'bpf' is required for DHCP.
>>> device bpf # Berkeley packet filter
>>>
>>> # The `epair' device implements a virtual back-to-back connected Ethernet
>>> # like interface pair.
>>> device epair
>>>
>>> # USB support
>>> options USB_DEBUG # enable debug msgs
>>> device uhci # UHCI PCI->USB interface
>>> device ohci # OHCI PCI->USB interface
>>> device ehci # EHCI PCI->USB interface (USB 2.0)
>>> device xhci # XHCI PCI->USB interface (USB 3.0)
>>> device usb # USB Bus (required)
>>> device uhid
>>> device ukbd # Keyboard
>>> device umass # Disks/Mass storage - Requires scbus and da
>>> device ums
>>>
>>> device filemon
>>>
>>> device if_bridge
>>>
>>>> On 20 Nov 2020, at 12:53, Kristof Provost <[hidden email]> wrote:
>>>>
>>>> Can you share your kernel config file (and src.conf / make.conf if they exist)?
>>>>
>>>> This second panic is in the IPSec code. My current thinking is that your kernel config is triggering a bug that’s manifesting in multiple places, but not actually caused by those places.
>>>>
>>>> I’d like to be able to reproduce it so we can debug it.
>>>>
>>>> Best regards,
>>>> Kristof
>>>>
>>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>>> Hi Kristof,
>>>>>
>>>>> This is 12-stable. With the previous bridge epochification that was backed out my config had a panic too.
>>>>>
>>>>> I don’t have any local modifications. I did a clean rebuild after removing /usr/obj/usr
>>>>>
>>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and nmdm.ko as modules. Everything else is statically linked. I have removed all drivers not needed for the hardware at hand.
>>>>>
>>>>> My bridge is between two vlans from the same trunk and the jail epair devices as well as the bhyve tap devices.
>>>>>
>>>>> The panic happens when the jails are starting.
>>>>>
>>>>> I can try to narrow it down over the weekend and make the crash dump available for analysis.
>>>>>
>>>>> Previously I had the following crash with 363492
>>>>>
>>>>> kernel trap 12 with interrupts disabled
>>>>>
>>>>>
>>>>> Fatal trap 12: page fault while in kernel mode
>>>>> cpuid = 2; apic id = 02
>>>>> fault virtual address = 0xffffffff00000410
>>>>> fault code = supervisor read data, page not present
>>>>> instruction pointer = 0x20:0xffffffff80692326
>>>>> stack pointer        = 0x28:0xfffffe00c06097b0
>>>>> frame pointer        = 0x28:0xfffffe00c06097f0
>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>> processor eflags = resume, IOPL = 0
>>>>> current process = 2030 (ifconfig)
>>>>> trap number = 12
>>>>> panic: page fault
>>>>> cpuid = 2
>>>>> time = 1595683412
>>>>> KDB: stack backtrace:
>>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <[hidden email]> wrote:
>>>>>>
>>>>>> On 20 Nov 2020, at 11:18, [hidden email] wrote:
>>>>>>> I’m afraid the last Epoch fix for bridge is not solving the problem (or perhaps creates a new one).
>>>>>>>
>>>>>> We’re talking about the stable/12 branch, right?
>>>>>>
>>>>>>> This seems to happen when the jail epair is added to the bridge.
>>>>>>>
>>>>>> There must be something more to it than that. I’ve run the bridge tests on stable/12 without issue, and this is a problem we didn’t see when the bridge epochification initially went into stable/12.
>>>>>>
>>>>>> Do you have a custom kernel config? Other patches? What exact commands do you run to trigger the panic?
>>>>>>
>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>>
>>>>>>>
>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>> cpuid = 6; apic id = 06
>>>>>>> fault virtual address = 0xc10
>>>>>>> fault code = supervisor read data, page not present
>>>>>>> instruction pointer = 0x20:0xffffffff80695e76
>>>>>>> stack pointer        = 0x28:0xfffffe00bf14e6e0
>>>>>>> frame pointer        = 0x28:0xfffffe00bf14e720
>>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>> processor eflags = resume, IOPL = 0
>>>>>>> current process = 1686 (jail)
>>>>>>> trap number = 12
>>>>>>> panic: page fault
>>>>>>> cpuid = 6
>>>>>>> time = 1605811310
>>>>>>> KDB: stack backtrace:
>>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>>>
>>>>>> This panic is rather odd. This isn’t even the bridge code. This is during initial creation of the vnet. I don’t really see how this could even trigger panics.
>>>>>> That panic looks as if something corrupted the net_epoch_preempt, by overwriting the epoch->e_epoch. The bridge patches only access this variable through the well-established functions and macros. I see no obvious way that they could corrupt it.
>>>>>>
>>>>>> Best regards,
>>>>>> Kristof
>>>>
>>>>
>



Re: Commit 367705+367706 causes a pabic

Kristof Provost
Peter,

Is that backtrace from the first or the second situation you describe?
What kernel config are you using with that backtrace?

This backtrace does not appear to involve the bridge. Given that part of
the panic message is cut off it’s very hard to conclude anything at
all from it.

Best regards,
Kristof
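
To make the "does this backtrace involve the bridge?" check repeatable, here is a minimal sketch in Python that parses KDB frames and filters them by function name. The helper names (`parse_backtrace`, `frames_matching`) are hypothetical and not part of any FreeBSD tool, and the sample holds only frames 7 to 10 of the reboot panic quoted below. A name match is only a heuristic: the absence of bridge frames does not prove the bridge code is uninvolved.

```python
import re

# One KDB frame looks like: "#7 0xffffffff8079f7ed at ip_input+0x63d"
FRAME_RE = re.compile(r"#(\d+)\s+(0x[0-9a-f]+)\s+at\s+(\w+)\+(0x[0-9a-f]+)")


def parse_backtrace(text):
    """Return (index, function, offset) for every KDB frame found in text."""
    frames = []
    for line in text.splitlines():
        # search() rather than match() so "> " quote prefixes are tolerated
        match = FRAME_RE.search(line)
        if match:
            frames.append(
                (int(match.group(1)), match.group(3), int(match.group(4), 16))
            )
    return frames


def frames_matching(frames, keyword):
    """Return the frames whose function name contains keyword."""
    return [frame for frame in frames if keyword in frame[1]]


# Frames 7-10 copied from the reboot panic quoted in this thread.
SAMPLE = """\
> #7 0xffffffff8079f7ed at ip_input+0x63d
> #8 0xffffffff8077a07a at netisr_dispatch_src+0xca
> #9 0xffffffff8075a6f8 at ether_demux+0x138
> #10 0xffffffff8075b9bb at ether_nh_input+0x33b
"""

frames = parse_backtrace(SAMPLE)
bridge_frames = frames_matching(frames, "bridge")
print([name for _, name, _ in frames])
print("bridge frames:", bridge_frames)
```

The same two helpers can be run over the earlier panics in the thread to compare which frames they share, for example by calling `frames_matching(frames, "epoch")` on each.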

On 23 Nov 2020, at 11:52, Peter Blok wrote:

> Kristof,
>
> With commits 367705+367706 and if_bridge statically linked, it crashes
> while adding an epair of a jail.
>
> With commits 367705+367706 and if_bridge dynamically loaded, there is a
> crash at reboot:
>
> #0 0xffffffff8069ddc5 at kdb_backtrace+0x65
> #1 0xffffffff80652c8b at vpanic+0x17b
> #2 0xffffffff80652b03 at panic+0x43
> #3 0xffffffff809c8951 at trap_fatal+0x391
> #4 0xffffffff809c89af at trap_pfault+0x4f
> #5 0xffffffff809c7ff6 at trap+0x286
> #6 0xffffffff809a1ec8 at calltrap+0x8
> #7 0xffffffff8079f7ed at ip_input+0x63d
> #8 0xffffffff8077a07a at netisr_dispatch_src+0xca
> #9 0xffffffff8075a6f8 at ether_demux+0x138
> #10 0xffffffff8075b9bb at ether_nh_input+0x33b
> #11 0xffffffff8077a07a at netisr_dispatch_src+0xca
> #12 0xffffffff8075ab1b at ether_input+0x4b
> #13 0xffffffff8077a80b at swi_net+0x12b
> #14 0xffffffff8061e10c at ithread_loop+0x23c
> #15 0xffffffff8061afbe at fork_exit+0x7e
> #16 0xffffffff809a2efe at fork_trampoline+0xe
>
> Peter
>
>> On 21 Nov 2020, at 17:22, Peter Blok <[hidden email]> wrote:
>>
>> Kristof,
>>
>> With a GENERIC kernel it does NOT happen. I do have a different iflib
>> related panic at reboot, but I’ll report that separately.
>>
>> I brought the two config files closer together and found out that if
>> I remove if_bridge from the config file and have it loaded
>> dynamically when the bridge is created, the problem no longer happens
>> and everything works ok.
>>
>> Peter
>>
>>> On 20 Nov 2020, at 15:53, Kristof Provost <[hidden email]> wrote:
>>>
>>> I still can’t reproduce that panic.
>>>
>>> Does it happen immediately after you start a vnet jail?
>>>
>>> Does it also happen with a GENERIC kernel?
>>>
>>> Regards,
>>> Kristof
>>>
>>> On 20 Nov 2020, at 14:53, Peter Blok wrote:
>>>
>>>> The panic with ipsec code in the backtrace was already very
>>>> strange. I was using IPsec, but only on one interface totally
>>>> separate from the members of the bridge as well as the bridge
>>>> itself. The jails were not doing any IPsec either. Note that this
>>>> panic was a while ago, after the 1st bridge epochification was
>>>> done on stable-12, which was later backed out.
>>>>
>>>> Today the system is no longer using ipsec, but it is still compiled
>>>> in. I can remove it if need be for a test.
>>>>
>>>>
>>>> src.conf
>>>> WITHOUT_KERBEROS=yes
>>>> WITHOUT_GSSAPI=yes
>>>> WITHOUT_SENDMAIL=true
>>>> WITHOUT_MAILWRAPPER=true
>>>> WITHOUT_DMAGENT=true
>>>> WITHOUT_GAMES=true
>>>> WITHOUT_IPFILTER=true
>>>> WITHOUT_UNBOUND=true
>>>> WITHOUT_PROFILE=true
>>>> WITHOUT_ATM=true
>>>> WITHOUT_BSNMP=true
>>>> #WITHOUT_CROSS_COMPILER=true
>>>> WITHOUT_DEBUG_FILES=true
>>>> WITHOUT_DICT=true
>>>> WITHOUT_FLOPPY=true
>>>> WITHOUT_HTML=true
>>>> WITHOUT_HYPERV=true
>>>> WITHOUT_NDIS=true
>>>> WITHOUT_NIS=true
>>>> WITHOUT_PPP=true
>>>> WITHOUT_TALK=true
>>>> WITHOUT_TESTS=true
>>>> WITHOUT_WIRELESS=true
>>>> #WITHOUT_LIB32=true
>>>> WITHOUT_LPR=true
>>>>
>>>> make.conf
>>>> KERNCONF=BHYVE
>>>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>>>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>>>> OPTIONS_UNSET=DOCS NLS MANPAGES
>>>>
>>>> BHYVE
>>>> cpu HAMMER
>>>> ident BHYVE
>>>>
>>>> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>>>> makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>>>>
>>>> options CAMDEBUG
>>>>
>>>> options SCHED_ULE # ULE scheduler
>>>> options PREEMPTION # Enable kernel thread preemption
>>>> options INET # InterNETworking
>>>> options INET6 # IPv6 communications protocols
>>>> options IPSEC
>>>> options TCP_OFFLOAD # TCP offload
>>>> options TCP_RFC7413 # TCP FASTOPEN
>>>> options SCTP # Stream Control Transmission Protocol
>>>> options FFS # Berkeley Fast Filesystem
>>>> options SOFTUPDATES # Enable FFS soft updates support
>>>> options UFS_ACL # Support for access control lists
>>>> options UFS_DIRHASH # Improve performance on big directories
>>>> options UFS_GJOURNAL # Enable gjournal-based UFS journaling
>>>> options QUOTA # Enable disk quotas for UFS
>>>> options SUIDDIR
>>>> options NFSCL # Network Filesystem Client
>>>> options NFSD # Network Filesystem Server
>>>> options NFSLOCKD # Network Lock Manager
>>>> options MSDOSFS # MSDOS Filesystem
>>>> options CD9660 # ISO 9660 Filesystem
>>>> options FUSEFS
>>>> options NULLFS # NULL filesystem
>>>> options UNIONFS
>>>> options FDESCFS # File descriptor filesystem
>>>> options PROCFS # Process filesystem (requires PSEUDOFS)
>>>> options PSEUDOFS # Pseudo-filesystem framework
>>>> options GEOM_PART_GPT # GUID Partition Tables.
>>>> options GEOM_RAID # Soft RAID functionality.
>>>> options GEOM_LABEL # Provides labelization
>>>> options GEOM_ELI # Disk encryption.
>>>> options COMPAT_FREEBSD32 # Compatible with i386 binaries
>>>> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
>>>> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
>>>> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
>>>> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
>>>> options COMPAT_FREEBSD9 # Compatible with FreeBSD9
>>>> options COMPAT_FREEBSD10 # Compatible with FreeBSD10
>>>> options COMPAT_FREEBSD11 # Compatible with FreeBSD11
>>>> options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>>>> options KTRACE # ktrace(1) support
>>>> options STACK # stack(9) support
>>>> options SYSVSHM # SYSV-style shared memory
>>>> options SYSVMSG # SYSV-style message queues
>>>> options SYSVSEM # SYSV-style semaphores
>>>> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>>>> options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
>>>> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
>>>> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
>>>> options AUDIT # Security event auditing
>>>> options CAPABILITY_MODE # Capsicum capability mode
>>>> options CAPABILITIES # Capsicum capabilities
>>>> options MAC # TrustedBSD MAC Framework
>>>> options MAC_PORTACL
>>>> options MAC_NTPD
>>>> options KDTRACE_FRAME # Ensure frames are compiled in
>>>> options KDTRACE_HOOKS # Kernel DTrace hooks
>>>> options DDB_CTF # Kernel ELF linker loads CTF data
>>>> options INCLUDE_CONFIG_FILE # Include this file in kernel
>>>>
>>>> # Debugging support.  Always need this:
>>>> options KDB # Enable kernel debugger support.
>>>> options KDB_TRACE # Print a stack trace for a panic.
>>>> options KDB_UNATTENDED
>>>>
>>>> # Make an SMP-capable kernel by default
>>>> options SMP # Symmetric MultiProcessor Kernel
>>>> options EARLY_AP_STARTUP
>>>>
>>>> # CPU frequency control
>>>> device cpufreq
>>>> device cpuctl
>>>> device coretemp
>>>>
>>>> # Bus support.
>>>> device acpi
>>>> options ACPI_DMAR
>>>> device pci
>>>> options PCI_IOV # PCI SR-IOV support
>>>>
>>>> device iicbus
>>>> device iicbb
>>>>
>>>> device iic
>>>> device ic
>>>> device iicsmb
>>>>
>>>> device ichsmb
>>>> device smbus
>>>> device smb
>>>>
>>>> #device jedec_dimm
>>>>
>>>> # ATA controllers
>>>> device ahci # AHCI-compatible SATA controllers
>>>> device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>>>>
>>>> # SCSI Controllers
>>>> device mps # LSI-Logic MPT-Fusion 2
>>>>
>>>> # ATA/SCSI peripherals
>>>> device scbus # SCSI bus (required for ATA/SCSI)
>>>> device da # Direct Access (disks)
>>>> device cd # CD
>>>> device pass # Passthrough device (direct ATA/SCSI access)
>>>> device ses # Enclosure Services (SES and SAF-TE)
>>>> device sg
>>>>
>>>> device cfiscsi
>>>> device ctl # CAM Target Layer
>>>> device iscsi
>>>>
>>>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>>>> device atkbdc # AT keyboard controller
>>>> device atkbd # AT keyboard
>>>> device psm # PS/2 mouse
>>>>
>>>> device kbdmux # keyboard multiplexer
>>>>
>>>> # vt is the new video console driver
>>>> device vt
>>>> device vt_vga
>>>> device vt_efifb
>>>>
>>>> # Serial (COM) ports
>>>> device uart # Generic UART driver
>>>>
>>>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>>>> device iflib
>>>> device em # Intel PRO/1000 Gigabit Ethernet Family
>>>> device ix # Intel PRO/10GbE PCIE PF Ethernet
>>>>
>>>> # Network stack virtualization.
>>>> options VIMAGE
>>>>
>>>> # Pseudo devices.
>>>> device crypto
>>>> device cryptodev
>>>> device loop # Network loopback
>>>> device random # Entropy device
>>>> device padlock_rng # VIA Padlock RNG
>>>> device rdrand_rng # Intel Bull Mountain RNG
>>>> device ipmi
>>>> device smbios
>>>> device vpd
>>>> device aesni # AES-NI OpenCrypto module
>>>> device ether # Ethernet support
>>>> device lagg
>>>> device vlan # 802.1Q VLAN support
>>>> device tuntap # Packet tunnel.
>>>> device md # Memory "disks"
>>>> device gif # IPv6 and IPv4 tunneling
>>>> device firmware # firmware assist module
>>>>
>>>> device pf
>>>> #device pflog
>>>> #device pfsync
>>>>
>>>> # The `bpf' device enables the Berkeley Packet Filter.
>>>> # Be aware of the administrative consequences of enabling this!
>>>> # Note that 'bpf' is required for DHCP.
>>>> device bpf # Berkeley packet filter
>>>>
>>>> # The `epair' device implements a virtual back-to-back connected Ethernet
>>>> # like interface pair.
>>>> device epair
>>>>
>>>> # USB support
>>>> options USB_DEBUG # enable debug msgs
>>>> device uhci # UHCI PCI->USB interface
>>>> device ohci # OHCI PCI->USB interface
>>>> device ehci # EHCI PCI->USB interface (USB 2.0)
>>>> device xhci # XHCI PCI->USB interface (USB 3.0)
>>>> device usb # USB Bus (required)
>>>> device uhid
>>>> device ukbd # Keyboard
>>>> device umass # Disks/Mass storage - Requires scbus and da
>>>> device ums
>>>>
>>>> device filemon
>>>>
>>>> device if_bridge
>>>>
>>>>> On 20 Nov 2020, at 12:53, Kristof Provost <[hidden email]> wrote:
>>>>>
>>>>> Can you share your kernel config file (and src.conf / make.conf if
>>>>> they exist)?
>>>>>
>>>>> This second panic is in the IPSec code. My current thinking is
>>>>> that your kernel config is triggering a bug that’s manifesting
>>>>> in multiple places, but not actually caused by those places.
>>>>>
>>>>> I’d like to be able to reproduce it so we can debug it.
>>>>>
>>>>> Best regards,
>>>>> Kristof
>>>>>
>>>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>>>> Hi Kristof,
>>>>>>
>>>>>> This is 12-stable. With the previous bridge epochification that
>>>>>> was backed out my config had a panic too.
>>>>>>
>>>>>> I don’t have any local modifications. I did a clean rebuild
>>>>>> after removing /usr/obj/usr
>>>>>>
>>>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko
>>>>>> and nmdm.ko as modules. Everything else is statically linked. I
>>>>>> have removed all drivers not needed for the hardware at hand.
>>>>>>
>>>>>> My bridge is between two vlans from the same trunk and the jail
>>>>>> epair devices as well as the bhyve tap devices.
>>>>>>
>>>>>> The panic happens when the jails are starting.
>>>>>>
>>>>>> I can try to narrow it down over the weekend and make the crash
>>>>>> dump available for analysis.
>>>>>>
>>>>>> Previously I had the following crash with 363492
>>>>>>
>>>>>> kernel trap 12 with interrupts disabled
>>>>>>
>>>>>>
>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>> cpuid = 2; apic id = 02
>>>>>> fault virtual address = 0xffffffff00000410
>>>>>> fault code = supervisor read data, page not present
>>>>>> instruction pointer = 0x20:0xffffffff80692326
>>>>>> stack pointer        = 0x28:0xfffffe00c06097b0
>>>>>> frame pointer        = 0x28:0xfffffe00c06097f0
>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>> processor eflags = resume, IOPL = 0
>>>>>> current process = 2030 (ifconfig)
>>>>>> trap number = 12
>>>>>> panic: page fault
>>>>>> cpuid = 2
>>>>>> time = 1595683412
>>>>>> KDB: stack backtrace:
>>>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <[hidden email]> wrote:
>>>>>>>
>>>>>>> On 20 Nov 2020, at 11:18, [hidden email] wrote:
>>>>>>>> I’m afraid the last Epoch fix for bridge is not solving the
>>>>>>>> problem (or perhaps creates a new one).
>>>>>>>>
>>>>>>> We’re talking about the stable/12 branch, right?
>>>>>>>
>>>>>>>> This seems to happen when the jail epair is added to the
>>>>>>>> bridge.
>>>>>>>>
>>>>>>> There must be something more to it than that. I’ve run the
>>>>>>> bridge tests on stable/12 without issue, and this is a problem
>>>>>>> we didn’t see when the bridge epochification initially went
>>>>>>> into stable/12.
>>>>>>>
>>>>>>> Do you have a custom kernel config? Other patches? What exact
>>>>>>> commands do you run to trigger the panic?
>>>>>>>
>>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>>>
>>>>>>>>
>>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>>> cpuid = 6; apic id = 06
>>>>>>>> fault virtual address = 0xc10
>>>>>>>> fault code = supervisor read data, page not present
>>>>>>>> instruction pointer = 0x20:0xffffffff80695e76
>>>>>>>> stack pointer        = 0x28:0xfffffe00bf14e6e0
>>>>>>>> frame pointer        = 0x28:0xfffffe00bf14e720
>>>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>>> processor eflags = resume, IOPL = 0
>>>>>>>> current process = 1686 (jail)
>>>>>>>> trap number = 12
>>>>>>>> panic: page fault
>>>>>>>> cpuid = 6
>>>>>>>> time = 1605811310
>>>>>>>> KDB: stack backtrace:
>>>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>>>>
>>>>>>> This panic is rather odd. This isn’t even the bridge code.
>>>>>>> This is during initial creation of the vnet. I don’t really
>>>>>>> see how this could even trigger panics.
>>>>>>> That panic looks as if something corrupted the
>>>>>>> net_epoch_preempt, by overwriting the epoch->e_epoch. The bridge
>>>>>>> patches only access this variable through the well-established
>>>>>>> functions and macros. I see no obvious way that they could
>>>>>>> corrupt it.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Kristof
>>>>>
>>>>>
>>
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: Commit 367705+367706 causes a pabic

Peter Blok
Kristof,

It’s from the 2nd situation. It is so weird. Last time there was ipsec code in the backtrace, which wasn’t used on the bridge+members.

This is from my own kernel config, but during testing with the GENERIC kernel I had similar backtraces at reboot.

I can’t do a lot right now, but I’m planning to:

- build kernel with -O0
- do the deletem of the epair manually

I’ll get back to you if I find something.

Peter

> On 23 Nov 2020, at 12:15, Kristof Provost <[hidden email]> wrote:
>
> Peter,
>
> Is that backtrace from the first or the second situation you describe? What kernel config are you using with that backtrace?
>
> This backtrace does not appear to involve the bridge. Given that part of the panic message is cut off it’s very hard to conclude anything at all from it.
>
> Best regards,
> Kristof
>
> On 23 Nov 2020, at 11:52, Peter Blok wrote:
>
>> Kristof,
>>
>> With commits 367705+367706 and if_bridge statically linked, it crashes while adding an epair of a jail.
>>
>> With commits 367705+367706 and if_bridge dynamically loaded, there is a crash at reboot:
>>
>> #0 0xffffffff8069ddc5 at kdb_backtrace+0x65
>> #1 0xffffffff80652c8b at vpanic+0x17b
>> #2 0xffffffff80652b03 at panic+0x43
>> #3 0xffffffff809c8951 at trap_fatal+0x391
>> #4 0xffffffff809c89af at trap_pfault+0x4f
>> #5 0xffffffff809c7ff6 at trap+0x286
>> #6 0xffffffff809a1ec8 at calltrap+0x8
>> #7 0xffffffff8079f7ed at ip_input+0x63d
>> #8 0xffffffff8077a07a at netisr_dispatch_src+0xca
>> #9 0xffffffff8075a6f8 at ether_demux+0x138
>> #10 0xffffffff8075b9bb at ether_nh_input+0x33b
>> #11 0xffffffff8077a07a at netisr_dispatch_src+0xca
>> #12 0xffffffff8075ab1b at ether_input+0x4b
>> #13 0xffffffff8077a80b at swi_net+0x12b
>> #14 0xffffffff8061e10c at ithread_loop+0x23c
>> #15 0xffffffff8061afbe at fork_exit+0x7e
>> #16 0xffffffff809a2efe at fork_trampoline+0xe
>>
>> Peter
>>
>>> On 21 Nov 2020, at 17:22, Peter Blok <[hidden email]> wrote:
>>>
>>> Kristof,
>>>
>>> With a GENERIC kernel it does NOT happen. I do have a different iflib related panic at reboot, but I’ll report that separately.
>>>
>>> I brought the two config files closer together and found out that if I remove if_bridge from the config file and have it loaded dynamically when the bridge is created, the problem no longer happens and everything works ok.
>>>
>>> Peter
>>>
>>>> On 20 Nov 2020, at 15:53, Kristof Provost <[hidden email]> wrote:
>>>>
>>>> I still can’t reproduce that panic.
>>>>
>>>> Does it happen immediately after you start a vnet jail?
>>>>
>>>> Does it also happen with a GENERIC kernel?
>>>>
>>>> Regards,
>>>> Kristof
>>>>
>>>> On 20 Nov 2020, at 14:53, Peter Blok wrote:
>>>>
>>>>> The panic with ipsec code in the backtrace was already very strange. I was using IPsec, but only on one interface totally separate from the members of the bridge as well as the bridge itself. The jails were not doing any IPsec either. Note that this panic was a while ago, after the 1st bridge epochification was done on stable-12, which was later backed out.
>>>>>
>>>>> Today the system is no longer using ipsec, but it is still compiled in. I can remove it if need be for a test.
>>>>>
>>>>>
>>>>> src.conf
>>>>> WITHOUT_KERBEROS=yes
>>>>> WITHOUT_GSSAPI=yes
>>>>> WITHOUT_SENDMAIL=true
>>>>> WITHOUT_MAILWRAPPER=true
>>>>> WITHOUT_DMAGENT=true
>>>>> WITHOUT_GAMES=true
>>>>> WITHOUT_IPFILTER=true
>>>>> WITHOUT_UNBOUND=true
>>>>> WITHOUT_PROFILE=true
>>>>> WITHOUT_ATM=true
>>>>> WITHOUT_BSNMP=true
>>>>> #WITHOUT_CROSS_COMPILER=true
>>>>> WITHOUT_DEBUG_FILES=true
>>>>> WITHOUT_DICT=true
>>>>> WITHOUT_FLOPPY=true
>>>>> WITHOUT_HTML=true
>>>>> WITHOUT_HYPERV=true
>>>>> WITHOUT_NDIS=true
>>>>> WITHOUT_NIS=true
>>>>> WITHOUT_PPP=true
>>>>> WITHOUT_TALK=true
>>>>> WITHOUT_TESTS=true
>>>>> WITHOUT_WIRELESS=true
>>>>> #WITHOUT_LIB32=true
>>>>> WITHOUT_LPR=true
>>>>>
>>>>> make.conf
>>>>> KERNCONF=BHYVE
>>>>> MODULES_OVERRIDE=opensolaris dtrace zfs vmm nmdm if_bridge bridgestp if_vxlan pflog libmchain libiconv smbfs linux linux64 linux_common linuxkpi linprocfs linsysfs ext2fs
>>>>> DEFAULT_VERSIONS+=perl5=5.30 mysql=5.7 python=3.8 python3=3.8
>>>>> OPTIONS_UNSET=DOCS NLS MANPAGES
>>>>>
>>>>> BHYVE
>>>>> cpu HAMMER
>>>>> ident BHYVE
>>>>>
>>>>> makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
>>>>> makeoptions WITH_CTF=1 # Run ctfconvert(1) for DTrace support
>>>>>
>>>>> options CAMDEBUG
>>>>>
>>>>> options SCHED_ULE # ULE scheduler
>>>>> options PREEMPTION # Enable kernel thread preemption
>>>>> options INET # InterNETworking
>>>>> options INET6 # IPv6 communications protocols
>>>>> options IPSEC
>>>>> options TCP_OFFLOAD # TCP offload
>>>>> options TCP_RFC7413 # TCP FASTOPEN
>>>>> options SCTP # Stream Control Transmission Protocol
>>>>> options FFS # Berkeley Fast Filesystem
>>>>> options SOFTUPDATES # Enable FFS soft updates support
>>>>> options UFS_ACL # Support for access control lists
>>>>> options UFS_DIRHASH # Improve performance on big directories
>>>>> options UFS_GJOURNAL # Enable gjournal-based UFS journaling
>>>>> options QUOTA # Enable disk quotas for UFS
>>>>> options SUIDDIR
>>>>> options NFSCL # Network Filesystem Client
>>>>> options NFSD # Network Filesystem Server
>>>>> options NFSLOCKD # Network Lock Manager
>>>>> options MSDOSFS # MSDOS Filesystem
>>>>> options CD9660 # ISO 9660 Filesystem
>>>>> options FUSEFS
>>>>> options NULLFS # NULL filesystem
>>>>> options UNIONFS
>>>>> options FDESCFS # File descriptor filesystem
>>>>> options PROCFS # Process filesystem (requires PSEUDOFS)
>>>>> options PSEUDOFS # Pseudo-filesystem framework
>>>>> options GEOM_PART_GPT # GUID Partition Tables.
>>>>> options GEOM_RAID # Soft RAID functionality.
>>>>> options GEOM_LABEL # Provides labelization
>>>>> options GEOM_ELI # Disk encryption.
>>>>> options COMPAT_FREEBSD32 # Compatible with i386 binaries
>>>>> options COMPAT_FREEBSD4 # Compatible with FreeBSD4
>>>>> options COMPAT_FREEBSD5 # Compatible with FreeBSD5
>>>>> options COMPAT_FREEBSD6 # Compatible with FreeBSD6
>>>>> options COMPAT_FREEBSD7 # Compatible with FreeBSD7
>>>>> options COMPAT_FREEBSD9 # Compatible with FreeBSD9
>>>>> options COMPAT_FREEBSD10 # Compatible with FreeBSD10
>>>>> options COMPAT_FREEBSD11 # Compatible with FreeBSD11
>>>>> options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
>>>>> options KTRACE # ktrace(1) support
>>>>> options STACK # stack(9) support
>>>>> options SYSVSHM # SYSV-style shared memory
>>>>> options SYSVMSG # SYSV-style message queues
>>>>> options SYSVSEM # SYSV-style semaphores
>>>>> options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
>>>>> options PRINTF_BUFR_SIZE=128 # Prevent printf output being interspersed.
>>>>> options KBD_INSTALL_CDEV # install a CDEV entry in /dev
>>>>> options HWPMC_HOOKS # Necessary kernel hooks for hwpmc(4)
>>>>> options AUDIT # Security event auditing
>>>>> options CAPABILITY_MODE # Capsicum capability mode
>>>>> options CAPABILITIES # Capsicum capabilities
>>>>> options MAC # TrustedBSD MAC Framework
>>>>> options MAC_PORTACL
>>>>> options MAC_NTPD
>>>>> options KDTRACE_FRAME # Ensure frames are compiled in
>>>>> options KDTRACE_HOOKS # Kernel DTrace hooks
>>>>> options DDB_CTF # Kernel ELF linker loads CTF data
>>>>> options INCLUDE_CONFIG_FILE # Include this file in kernel
>>>>>
>>>>> # Debugging support.  Always need this:
>>>>> options KDB # Enable kernel debugger support.
>>>>> options KDB_TRACE # Print a stack trace for a panic.
>>>>> options KDB_UNATTENDED
>>>>>
>>>>> # Make an SMP-capable kernel by default
>>>>> options SMP # Symmetric MultiProcessor Kernel
>>>>> options EARLY_AP_STARTUP
>>>>>
>>>>> # CPU frequency control
>>>>> device cpufreq
>>>>> device cpuctl
>>>>> device coretemp
>>>>>
>>>>> # Bus support.
>>>>> device acpi
>>>>> options ACPI_DMAR
>>>>> device pci
>>>>> options PCI_IOV # PCI SR-IOV support
>>>>>
>>>>> device iicbus
>>>>> device iicbb
>>>>>
>>>>> device iic
>>>>> device ic
>>>>> device iicsmb
>>>>>
>>>>> device ichsmb
>>>>> device smbus
>>>>> device smb
>>>>>
>>>>> #device jedec_dimm
>>>>>
>>>>> # ATA controllers
>>>>> device ahci # AHCI-compatible SATA controllers
>>>>> device mvs # Marvell 88SX50XX/88SX60XX/88SX70XX/SoC SATA
>>>>>
>>>>> # SCSI Controllers
>>>>> device mps # LSI-Logic MPT-Fusion 2
>>>>>
>>>>> # ATA/SCSI peripherals
>>>>> device scbus # SCSI bus (required for ATA/SCSI)
>>>>> device da # Direct Access (disks)
>>>>> device cd # CD
>>>>> device pass # Passthrough device (direct ATA/SCSI access)
>>>>> device ses # Enclosure Services (SES and SAF-TE)
>>>>> device sg
>>>>>
>>>>> device cfiscsi
>>>>> device ctl # CAM Target Layer
>>>>> device iscsi
>>>>>
>>>>> # atkbdc0 controls both the keyboard and the PS/2 mouse
>>>>> device atkbdc # AT keyboard controller
>>>>> device atkbd # AT keyboard
>>>>> device psm # PS/2 mouse
>>>>>
>>>>> device kbdmux # keyboard multiplexer
>>>>>
>>>>> # vt is the new video console driver
>>>>> device vt
>>>>> device vt_vga
>>>>> device vt_efifb
>>>>>
>>>>> # Serial (COM) ports
>>>>> device uart # Generic UART driver
>>>>>
>>>>> # PCI/PCI-X/PCIe Ethernet NICs that use iflib infrastructure
>>>>> device iflib
>>>>> device em # Intel PRO/1000 Gigabit Ethernet Family
>>>>> device ix # Intel PRO/10GbE PCIE PF Ethernet
>>>>>
>>>>> # Network stack virtualization.
>>>>> options VIMAGE
>>>>>
>>>>> # Pseudo devices.
>>>>> device crypto
>>>>> device cryptodev
>>>>> device loop # Network loopback
>>>>> device random # Entropy device
>>>>> device padlock_rng # VIA Padlock RNG
>>>>> device rdrand_rng # Intel Bull Mountain RNG
>>>>> device ipmi
>>>>> device smbios
>>>>> device vpd
>>>>> device aesni # AES-NI OpenCrypto module
>>>>> device ether # Ethernet support
>>>>> device lagg
>>>>> device vlan # 802.1Q VLAN support
>>>>> device tuntap # Packet tunnel.
>>>>> device md # Memory "disks"
>>>>> device gif # IPv6 and IPv4 tunneling
>>>>> device firmware # firmware assist module
>>>>>
>>>>> device pf
>>>>> #device pflog
>>>>> #device pfsync
>>>>>
>>>>> # The `bpf' device enables the Berkeley Packet Filter.
>>>>> # Be aware of the administrative consequences of enabling this!
>>>>> # Note that 'bpf' is required for DHCP.
>>>>> device bpf # Berkeley packet filter
>>>>>
>>>>> # The `epair' device implements a virtual back-to-back connected Ethernet
>>>>> # like interface pair.
>>>>> device epair
>>>>>
>>>>> # USB support
>>>>> options USB_DEBUG # enable debug msgs
>>>>> device uhci # UHCI PCI->USB interface
>>>>> device ohci # OHCI PCI->USB interface
>>>>> device ehci # EHCI PCI->USB interface (USB 2.0)
>>>>> device xhci # XHCI PCI->USB interface (USB 3.0)
>>>>> device usb # USB Bus (required)
>>>>> device uhid
>>>>> device ukbd # Keyboard
>>>>> device umass # Disks/Mass storage - Requires scbus and da
>>>>> device ums
>>>>>
>>>>> device filemon
>>>>>
>>>>> device if_bridge
>>>>>
>>>>>> On 20 Nov 2020, at 12:53, Kristof Provost <[hidden email]> wrote:
>>>>>>
>>>>>> Can you share your kernel config file (and src.conf / make.conf if they exist)?
>>>>>>
>>>>>> This second panic is in the IPSec code. My current thinking is that your kernel config is triggering a bug that’s manifesting in multiple places, but not actually caused by those places.
>>>>>>
>>>>>> I’d like to be able to reproduce it so we can debug it.
>>>>>>
>>>>>> Best regards,
>>>>>> Kristof
>>>>>>
>>>>>> On 20 Nov 2020, at 12:02, Peter Blok wrote:
>>>>>>> Hi Kristof,
>>>>>>>
>>>>>>> This is 12-stable. With the previous bridge epochification that was backed out my config had a panic too.
>>>>>>>
>>>>>>> I don’t have any local modifications. I did a clean rebuild after removing /usr/obj/usr
>>>>>>>
>>>>>>> My kernel is custom - I only have zfs.ko, opensolaris.ko, vmm.ko and nmdm.ko as modules. Everything else is statically linked. I have removed all drivers not needed for the hardware at hand.
>>>>>>>
>>>>>>> My bridge connects two VLANs from the same trunk, the jail epair devices, and the bhyve tap devices.
>>>>>>>
>>>>>>> The panic happens when the jails are starting.
>>>>>>>
>>>>>>> I can try to narrow it down over the weekend and make the crash dump available for analysis.
>>>>>>>
>>>>>>> Previously I had the following crash with 363492
>>>>>>>
>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>>
>>>>>>>
>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>> cpuid = 2; apic id = 02
>>>>>>> fault virtual address = 0xffffffff00000410
>>>>>>> fault code = supervisor read data, page not present
>>>>>>> instruction pointer = 0x20:0xffffffff80692326
>>>>>>> stack pointer        = 0x28:0xfffffe00c06097b0
>>>>>>> frame pointer        = 0x28:0xfffffe00c06097f0
>>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>> processor eflags = resume, IOPL = 0
>>>>>>> current process = 2030 (ifconfig)
>>>>>>> trap number = 12
>>>>>>> panic: page fault
>>>>>>> cpuid = 2
>>>>>>> time = 1595683412
>>>>>>> KDB: stack backtrace:
>>>>>>> #0 0xffffffff80698165 at kdb_backtrace+0x65
>>>>>>> #1 0xffffffff8064d67b at vpanic+0x17b
>>>>>>> #2 0xffffffff8064d4f3 at panic+0x43
>>>>>>> #3 0xffffffff809cc311 at trap_fatal+0x391
>>>>>>> #4 0xffffffff809cc36f at trap_pfault+0x4f
>>>>>>> #5 0xffffffff809cb9b6 at trap+0x286
>>>>>>> #6 0xffffffff809a5b28 at calltrap+0x8
>>>>>>> #7 0xffffffff803677fd at ck_epoch_synchronize_wait+0x8d
>>>>>>> #8 0xffffffff8069213a at epoch_wait_preempt+0xaa
>>>>>>> #9 0xffffffff807615b7 at ipsec_ioctl+0x3a7
>>>>>>> #10 0xffffffff8075274f at ifioctl+0x47f
>>>>>>> #11 0xffffffff806b5ea7 at kern_ioctl+0x2b7
>>>>>>> #12 0xffffffff806b5b4a at sys_ioctl+0xfa
>>>>>>> #13 0xffffffff809ccec7 at amd64_syscall+0x387
>>>>>>> #14 0xffffffff809a6450 at fast_syscall_common+0x101
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 20 Nov 2020, at 11:30, Kristof Provost <[hidden email]> wrote:
>>>>>>>>
>>>>>>>> On 20 Nov 2020, at 11:18, [hidden email] wrote:
>>>>>>>>> I’m afraid the last Epoch fix for bridge is not solving the problem (or perhaps creates a new one).
>>>>>>>>>
>>>>>>>> We’re talking about the stable/12 branch, right?
>>>>>>>>
>>>>>>>>> This seems to happen when the jail epair is added to the bridge.
>>>>>>>>>
>>>>>>>> There must be something more to it than that. I’ve run the bridge tests on stable/12 without issue, and this is a problem we didn’t see when the bridge epochification initially went into stable/12.
>>>>>>>>
>>>>>>>> Do you have a custom kernel config? Other patches? What exact commands do you run to trigger the panic?
>>>>>>>>
>>>>>>>>> kernel trap 12 with interrupts disabled
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Fatal trap 12: page fault while in kernel mode
>>>>>>>>> cpuid = 6; apic id = 06
>>>>>>>>> fault virtual address = 0xc10
>>>>>>>>> fault code = supervisor read data, page not present
>>>>>>>>> instruction pointer = 0x20:0xffffffff80695e76
>>>>>>>>> stack pointer        = 0x28:0xfffffe00bf14e6e0
>>>>>>>>> frame pointer        = 0x28:0xfffffe00bf14e720
>>>>>>>>> code segment = base 0x0, limit 0xfffff, type 0x1b
>>>>>>>>> = DPL 0, pres 1, long 1, def32 0, gran 1
>>>>>>>>> processor eflags = resume, IOPL = 0
>>>>>>>>> current process = 1686 (jail)
>>>>>>>>> trap number = 12
>>>>>>>>> panic: page fault
>>>>>>>>> cpuid = 6
>>>>>>>>> time = 1605811310
>>>>>>>>> KDB: stack backtrace:
>>>>>>>>> #0 0xffffffff8069bb85 at kdb_backtrace+0x65
>>>>>>>>> #1 0xffffffff80650a4b at vpanic+0x17b
>>>>>>>>> #2 0xffffffff806508c3 at panic+0x43
>>>>>>>>> #3 0xffffffff809d0351 at trap_fatal+0x391
>>>>>>>>> #4 0xffffffff809d03af at trap_pfault+0x4f
>>>>>>>>> #5 0xffffffff809cf9f6 at trap+0x286
>>>>>>>>> #6 0xffffffff809a98c8 at calltrap+0x8
>>>>>>>>> #7 0xffffffff80368a8d at ck_epoch_synchronize_wait+0x8d
>>>>>>>>> #8 0xffffffff80695c8a at epoch_wait_preempt+0xaa
>>>>>>>>> #9 0xffffffff80757d40 at vnet_if_init+0x120
>>>>>>>>> #10 0xffffffff8078c994 at vnet_alloc+0x114
>>>>>>>>> #11 0xffffffff8061e3f7 at kern_jail_set+0x1bb7
>>>>>>>>> #12 0xffffffff80620190 at sys_jail_set+0x40
>>>>>>>>> #13 0xffffffff809d0f07 at amd64_syscall+0x387
>>>>>>>>> #14 0xffffffff809aa1ee at fast_syscall_common+0xf8
>>>>>>>>
>>>>>>>> This panic is rather odd. This isn’t even the bridge code. This is during initial creation of the vnet. I don’t really see how this could even trigger panics.
>>>>>>>> That panic looks as if something corrupted the net_epoch_preempt, by overwriting the epoch->e_epoch. The bridge patches only access this variable through the well-established functions and macros. I see no obvious way that they could corrupt it.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Kristof
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> [hidden email] mailing list
>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
>>>>>> To unsubscribe, send any mail to "[hidden email]"
>>>

