more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

freebsd-stable mailing list
After discussion with Bob Bishop (thanks for the help!) I've tried to do
the following to upgrade one of the old boxes I mentioned previously.

cd /usr/src
tar ... .
rm -rf .??* *
svn checkout https://svn.freebsd.org/base/releng/10.3 /usr/src
compile, installkernel, installworld...

Now that the host is running RELENG the next step was to update from
10.4 to 11.2 via freebsd-update

freebsd-update
freebsd-install
freebsd-update upgrade -r 11.2-RELEASE
freebsd-update install

so far, so good. Now it all falls apart

shutdown -r now
... why isn't the host coming back? Oh look, kernel panic.

   Fatal trap 12: page fault while in kernel mode
   cpuid = 1; apic id = 01
   fault virtual address = 0x84
   fault code = supervisor read data, page not present

Google searches find references to the same panic type in VMs running
11.1, including https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220923

The differences are that the bug report is against 11.1, not 11.2 (I
would presume the fix made it into 11.2, but maybe not), and most notably
that it concerns VMs, while the host I'm doing this on is bare iron
(Sun x4500).

Still, I gave the two entries in /boot/loader.conf a try, no joy.
Exactly the same panic. Recording the boot with slow-mo shows the panic
happening just after the USB devices are enumerated by the kernel. It
never even tries to mount root.

I am able to boot to kernel.old, which appears to be my old 10.4-STABLE
kernel. So now I'm kind of stuck. The update has already modified the
config files as part of the first pass so rolling back may be a problem
and moving forward seems unwise.

I have only one x4500 but I have three x4540s running 11.2-STABLE (also
installed from source) just fine.

Anyone have any brilliant suggestions? I'm thinking of trying to compile
11.2-RELENG in /usr/src so I can try installing that kernel but that'll
take several hours at least (it's an old box).

nomad
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"
Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

Miroslav Lachman
Lee Damon via freebsd-stable wrote on 2019/03/01 22:53:

> After discussion with Bob Bishop (thanks for the help!) I've tried to do
> the following to upgrade one of the old boxes I mentioned previously.
>
> cd /usr/src
> tar ... .
> rm -rf .??* *
> svn checkout https://svn.freebsd.org/base/releng/10.3 /usr/src
> compile, installkernel, installworld...
>
> Now that the host is running RELENG the next step was to update from
> 10.4 to 11.2 via freebsd-update
>
> freebsd-update
> freebsd-install
> freebsd-update upgrade -r 11.2-RELEASE
> freebsd-update install
>
> so far, so good. Now it all falls apart
>
> shutdown -r now
> ... why isn't the host coming back? Oh look, kernel panic.
>
>    Fatal trap 12: page fault while in kernel mode
>    cpuid = 1; apic id = 01
>    fault virtual address = 0x84
>    fault code = supervisor read data, page not present

I went back from freebsd-update to source upgrades a few years ago and now
use source builds exclusively (built on a powerful build machine and
distributed to clients through NFS, so clients can just run make
installkernel and make installworld) because I was bitten by failed
freebsd-update upgrades many times...

> Google searches find references to the same panic type in VMs running
> 11.1, including https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=220923
>
> The differences are that the bug report is against 11.1, not 11.2 (I
> would presume the fix made it into 11.2, but maybe not), and most notably
> that it concerns VMs, while the host I'm doing this on is bare iron
> (Sun x4500).
>
> Still, I gave the two entries in /boot/loader.conf a try, no joy.
> Exactly the same panic. Recording the boot with slow-mo shows the panic
> happening just after the USB devices are enumerated by the kernel. It
> never even tries to mount root.
>
> I am able to boot to kernel.old, which appears to be my old 10.4-STABLE
> kernel. So now I'm kind of stuck. The update has already modified the
> config files as part of the first pass so rolling back may be a problem
> and moving forward seems unwise.
>
> I have only one x4500 but I have three x4540s running 11.2-STABLE (also
> installed from source) just fine.
>
> Anyone have any brilliant suggestions? I'm thinking of trying to compile
> 11.2-RELENG in /usr/src so I can try installing that kernel but that'll
> take several hours at least (it's an old box).

If you can boot with the old 10.4 kernel and go online, just fetch
kernel.txz from the net:
http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/kernel.txz
and unpack it to /boot/kernel112. Then you can reboot and manually
select this kernel instead of the default /boot/kernel.
If you cannot access the boot loader prompt, you can try the "nextboot" command:
1) unpack the kernel
2) set nextboot: nextboot -k kernel112
3) shutdown -r now and hope for luck
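
The steps above could be sketched roughly as the following shell functions. This is only an illustration, not a tested procedure: the function names and the scratch directory are made up for the example, and it assumes that kernel.txz stores its files under ./boot/kernel/, which is why it unpacks into a temporary directory and moves the result to /boot/kernel112.

```shell
#!/bin/sh
# Sketch only: stage the 11.2-RELEASE kernel next to the installed one.
# kernel_url and stage_kernel are hypothetical helper names.
RELEASE="11.2-RELEASE"
ARCH="amd64"
KDIR="kernel112"

# Build the download URL from architecture ($1) and release name ($2).
kernel_url() {
    echo "http://ftp.freebsd.org/pub/FreeBSD/releases/$1/$2/kernel.txz"
}

# Fetch, unpack and register the kernel for the next boot (run as root).
stage_kernel() {
    work=$(mktemp -d) || return 1
    fetch -o "$work/kernel.txz" "$(kernel_url "$ARCH" "$RELEASE")" || return 1
    # Assumption: the archive unpacks to boot/kernel/ below the target dir.
    tar -xf "$work/kernel.txz" -C "$work" || return 1
    mv "$work/boot/kernel" "/boot/$KDIR" || return 1
    rm -rf "$work"
    # Use this kernel for the next boot only; /boot/kernel stays the default.
    nextboot -k "$KDIR"
}
```

After calling stage_kernel as root, "shutdown -r now" would try the new kernel once. If it panics, the following boot should go back to the default kernel.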

If your machine boots fine with the 11.2 kernel, you can fetch sources and
rebuild kernel and userland for 11.2 as usual.
Or you can try to fetch and unpack base.txz
http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/base.txz
over your current files. It can make a mess but you can always clean it
with "make delete-old && make delete-old-libs"

Miroslav Lachman

Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

freebsd-stable mailing list
On 3/1/19 14:19 , Miroslav Lachman wrote:
> If you can boot with the old 10.4 kernel and go online, just fetch
> kernel.txz from the net:
> http://ftp.freebsd.org/pub/FreeBSD/releases/amd64/11.2-RELEASE/kernel.txz and
> unpack it to /boot/kernel112 then you can try to reboot and manually
> select to boot this kernel instead of default /boot/kernel.

Darn it. I get the same kernel panic with that one.

I'm compiling locally but I don't expect that to make any difference.
I'll need to go pawing through the release notes and see if there are
any references to deprecated hardware that might be involved.

I'm attaching a copy of dmesg output from a successful boot into
10.4-STABLE. The kernel panic appears to happen around 15% of the way
into the output, around

...
mvsch13: <Marvell SATA channel> at channel 5 on mvs1
mvsch14: <Marvell SATA channel> at channel 6 on mvs1
mvsch15: <Marvell SATA channel> at channel 7 on mvs1
pcib3: <ACPI PCI-PCI bridge> at device 6.0 on pci0
pci3: <ACPI PCI bus> on pcib3
ohci0: <OHCI (generic) USB controller> mem 0xfd1fe000-0xfd1fefff irq 19 at device 0.0 on pci3
usbus0 on ohci0
ohci1: <OHCI (generic) USB controller> mem 0xfd1fd000-0xfd1fdfff irq 19 at device 0.1 on pci3
usbus1 on ohci1
...

(Just before it enumerates vgapci0)

but I can't be sure because the screen moves so fast that even slow-mo
video is just a blur.

nomad

Attachment: goose_dmesg.txt (37K)

Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

Miroslav Lachman
Lee Damon wrote on 2019/03/02 00:06:

> Darn it. I get the same kernel panic with that one.
>
> I'm compiling locally but I don't expect that to make any difference.
> I'll need to go pawing through the release notes and see if there are
> any references to deprecated hardware that might be involved.
>
> I'm attaching a copy of dmesg output from a successful boot into
> 10.4-STABLE. The kernel panic appears to happen around 15% of the way
> into the output, around

I am running 11.2 on a SunFire X2100 M2, but according to your dmesg it
uses different chips. The X2100 M2 has an nVidia nForce MCP55 chipset for
ATA devices, nfe for 2 NICs and Broadcom bge for the other 2 NICs.

Did you try booting in "safe mode"? (selectable in the boot menu).
Or you can try to disable / enable some settings in the BIOS. Something
related to USB or onboard VGA etc. may help.

Miroslav Lachman

Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

freebsd-stable mailing list
On 3/1/19 15:38 , Miroslav Lachman wrote:
> Did you try booting in "safe mode"? (selectable in the boot menu).

I completely forgot about safe mode.

Yep. It boots. I'm going to finish the freebsd-update process then
reboot into safe mode again. I'm out of time to work on this today and
am only in this lab on Fridays so I'll have to pick up working on this
problem next Friday.

Thanks for the help,
nomad

Re: more fun, upgrading from 10.3-STABLE 10.4-RELENG to 11.2-RELENG - kernel panic

Miroslav Lachman
Lee Damon wrote on 2019/03/02 01:36:
> On 3/1/19 15:38 , Miroslav Lachman wrote:
>> Did you try booting in "safe mode"? (selectable in the boot menu).
>
> I completely forgot about safe mode.
>
> Yep. It boots. I'm going to finish the freebsd-update process then
> reboot into safe mode again. I'm out of time to work on this today and
> am only in this lab on Fridays so I'll have to pick up working on this
> problem next Friday.

Glad to know something finally works :)

You can look in /boot/menu-commands.4th; it contains the definition of
what Safe Mode sets:


: safemode_enabled? ( -- flag )
         s" kern.smp.disabled" getenv -1 <> dup if
                 swap drop ( c-addr flag -- flag )
         then
;

: safemode_enable ( -- )
         s" set kern.smp.disabled=1" evaluate
         s" set hw.ata.ata_dma=0" evaluate
         s" set hw.ata.atapi_dma=0" evaluate
         s" set hw.ata.wc=0" evaluate
         s" set hw.eisa_slots=0" evaluate
         s" set kern.eventtimer.periodic=1" evaluate
         s" set kern.geom.part.check_integrity=0" evaluate
;

You can play with these items one by one to find the root cause in
your case.
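
One possible way to test them one at a time is sketched below. The helper names are made up for this example; the variable list is copied from safemode_enable above. Each printed line is in /boot/loader.conf syntax, so a single line can be added before a normal (non-safe-mode) boot, or the same name and value can be typed with "set" at the loader prompt.

```shell
#!/bin/sh
# Sketch only: list the loader variables that Safe Mode sets.
# safemode_tunables and loader_conf_line are hypothetical helper names.
safemode_tunables() {
    cat <<'EOF'
kern.smp.disabled=1
hw.ata.ata_dma=0
hw.ata.atapi_dma=0
hw.ata.wc=0
hw.eisa_slots=0
kern.eventtimer.periodic=1
kern.geom.part.check_integrity=0
EOF
}

# Print the Nth variable (1-based) as a loader.conf line: name="value".
loader_conf_line() {
    safemode_tunables | sed -n "${1}p" | sed 's/=\(.*\)$/="\1"/'
}
```

For example, loader_conf_line 1 prints kern.smp.disabled="1". If the machine boots with only that line in /boot/loader.conf, SMP is the likely culprit; if not, remove it and try the next one.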

Miroslav Lachman