Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Paul Mather-3
Yesterday, I updated my RELENG_8 ZFS-only system, which has worked like a champ for ages.  After a successful install{kernel,world} and reboot, I noticed the 20121130 entry in /usr/src/UPDATING and upgraded my ZFS pool via "zpool upgrade -a".  I also upgraded my boot blocks as requested, as per the "ZFS notes" section of /usr/src/UPDATING.

Unfortunately rebooting with the upgraded pool failed.  The "windmill" boot spinner spins for a tiny amount of time and then stops dead. :-(  I don't get to the boot loader menu at all.

I downloaded a very recent RELENG_8 snapshot (FreeBSD-8.3-RELENG_8-r244923-JPSNAP-amd64-amd64-memstick.img) from pub.allbsd.org and was able to boot successfully from USB using that.  I entered Fixit Mode and tried to write the boot blocks on the memstick image onto my hard drives but the resultant system still wouldn't boot.  The commands I used (from Fixit Mode) are these:

        gpart bootcode -b /dist/boot/pmbr -p /dist/boot/gptzfsboot -i 1 ad4
        gpart bootcode -b /dist/boot/pmbr -p /dist/boot/gptzfsboot -i 1 ad6

(ad4 and ad6 are my two hard drives.)
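
With a two-disk mirror the same boot code has to land on every member, so it can help to generate the commands in a loop and review them before running anything.  A minimal sketch, assuming POSIX sh: the function name `bootcode_cmds` and the `BOOTDIR` variable are made up for illustration, and the function only prints the commands rather than executing them.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print, but do not run, the
# gpart command for each disk named on the command line.
# BOOTDIR is an assumed variable; /dist/boot is where the boot blocks
# were found in Fixit Mode in the commands above.
BOOTDIR="${BOOTDIR:-/dist/boot}"

bootcode_cmds() {
    for disk in "$@"; do
        # -i 1 assumes freebsd-boot is partition 1, as in the commands above
        echo "gpart bootcode -b ${BOOTDIR}/pmbr -p ${BOOTDIR}/gptzfsboot -i 1 ${disk}"
    done
}

# Print the commands for both mirror members
bootcode_cmds ad4 ad6
```

Printing first makes it easy to confirm the disk names and partition index before anything is written to the drives.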

If I "load zfs" before booting the USB memstick then I can see my old pool listed when I do "zpool import".  I haven't tried importing the pool because I'm not sure whether that would make the problem worse.

Does anyone have any advice on restoring this system to bootability?  I followed the standard "root on ZFS" recipe using a two-drive mirror when installing the system initially.  Each drive uses GPT with three partitions: freebsd-boot, freebsd-swap, and freebsd-zfs, in that order.  Like I said at the start, all this worked for a long time until just now, when I upgraded the pool to enable "feature flags" support. :-(

Any help is appreciated.

Cheers,

Paul.


_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[hidden email]"

Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Matthew Seaman-2
On 02/01/2013 17:49, Paul Mather wrote:

> Any help is appreciated.
I think you may be running into problems with zpool.cache.  This has
been fixed in current, which now has the ability to find the root zpool
without a valid zpool.cache, but that I suspect is faint comfort for you.

To recover from a toasted zpool.cache, you need to boot from alternate
media and then import your root zpool.  It's easiest to do that to a
temporary directory.  The important bit is to copy the zpool.cache onto
your actual zroot device:

 -- Boot from install media to 'Live CD' and log in as root (no password)

# kldload zfs               -- should load opensolaris.ko automatically
# cd /tmp                   -- this should be a writable MFS; you'll
                               need to arrange something similar if
                               not.
# zpool import -o cachefile=/tmp/zpool.cache -R /tmp/zroot zroot
                            -- this should create a zpool.cache file
# cp zpool.cache /tmp/zroot/boot/zfs/
# zfs umount -a
# shutdown -r now

Eject the install media, and the system should boot up from your root zpool.
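
The recovery steps above can be collected into one script.  This is a sketch only, assuming POSIX sh.  It folds in two adjustments reported later in this thread: forcing the import with -f, and mounting a root dataset whose mountpoint is "legacy" by hand.  The `run` wrapper and the `DRY_RUN` variable are hypothetical additions so the commands can be reviewed before anything is executed.

```shell
#!/bin/sh
# Sketch of the zpool.cache recovery sequence.  DRY_RUN and run() are
# illustrative assumptions, not part of any FreeBSD tool.  With DRY_RUN=1
# (the default here) each command is only printed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_zpool_cache() {
    pool="${1:-zroot}"              # pool name used in the thread
    altroot="/tmp/${pool}"
    run kldload zfs
    # -f is needed if the pool was last in use by the unbootable system
    run zpool import -f -o cachefile=/tmp/zpool.cache -R "$altroot" "$pool"
    # Only needed when the root dataset has mountpoint=legacy
    run mount -t zfs "$pool" "$altroot"
    run cp /tmp/zpool.cache "${altroot}/boot/zfs/"
    run zfs umount -a
    run shutdown -r now
}

# Default dry run: prints the six commands without executing them
recover_zpool_cache zroot
```

Setting DRY_RUN=0 would execute the commands for real, so that is best done only from the rescue environment and only after checking the printed output.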

        Cheers,

        Matthew

--
Dr Matthew J Seaman MA, D.Phil.

PGP: http://www.infracaninophile.co.uk/pgpkey
JID: [hidden email]



Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Paul Mather-3
On Jan 2, 2013, at 2:10 PM, Matthew Seaman <[hidden email]> wrote:

> I think you may be running into problems with zpool.cache.  This has
> been fixed in current, which now has the ability to find the root zpool
> without a valid zpool.cache, but that I suspect is faint comfort for you.
>
> To recover from a toasted zpool.cache, you need to boot from alternate
> media and then import your root zpool.  It's easiest to do that to a
> temporary directory.  The important bit is to copy the zpool.cache onto
> your actual zroot device:
>
> -- Boot from install media to 'Live CD' and log in as root (no password)

Given the above, does this need to be a -CURRENT Live CD?  I've been using the RELENG_8 snapshot memstick.img mentioned in my original message above.


>
> # kldload zfs               -- should load opensolaris.ko automatically
> # cd /tmp                   -- this should be a writable MFS; you'll
>                               need to arrange something similar if
>                               not.
> # zpool import -o cachefile=/tmp/zpool.cache -R /tmp/zroot zroot
>                            -- this should create a zpool.cache file


I tried this and it complained about the pool being in use by another system---the original system that won't boot any more.  I expected this, and added "-f" to force an import.

> # cp zpool.cache /tmp/zroot/boot/zfs/


This part also failed for me.  My "zroot" dataset has its mountpoint property set to "legacy", so I had to mount it manually via "mount -t zfs zroot /tmp/zroot" to make the /tmp/zroot/boot/zfs directory accessible.
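
Whether the manual mount is needed depends on the dataset's mountpoint property, which can be read with "zfs get -H -o value mountpoint zroot".  A small sketch of that decision, assuming POSIX sh; `mount_cmd_for` is a hypothetical helper that only prints the command it would suggest.

```shell
#!/bin/sh
# Hypothetical helper: given a mountpoint property value, a dataset name
# and a target directory, print the command that makes the dataset
# accessible.  Assumes the property is either "legacy" or a real path.
mount_cmd_for() {
    prop="$1"; dataset="$2"; target="$3"
    if [ "$prop" = "legacy" ]; then
        # legacy datasets are mounted with mount(8), not by ZFS itself
        echo "mount -t zfs ${dataset} ${target}"
    else
        # otherwise ZFS manages the mount (relative to the import altroot)
        echo "zfs mount ${dataset}"
    fi
}

# On the live system the property value would come from:
#   zfs get -H -o value mountpoint zroot
mount_cmd_for legacy zroot /tmp/zroot
```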


> # zfs umount -a
> # shutdown -r
>
> Eject the install media, and the system should boot up from your root zpool.


Unfortunately, it didn't boot from the root pool.  I get the same thing happening: the "windmill" spins for a very short time and then stops dead.  I don't even make it to the "BTX Loader" output, let alone the boot loader menu options. :-(

Thank you for the suggestions.

Cheers,

Paul.



Solved?: Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Paul Mather-3
On Jan 2, 2013, at 2:10 PM, Matthew Seaman <[hidden email]> wrote:

> I think you may be running into problems with zpool.cache.  This has
> been fixed in current, which now has the ability to find the root zpool
> without a valid zpool.cache, but that I suspect is faint comfort for you.


It turns out it was my /boot.config that was preventing booting.  The system is usually headless, so I have "-S115200 -Dh" as the sole line in /boot.config to enable a 115200 baud serial console.  This had been working fine for me up until I did a {build,install} {kernel,world} on 1st January 2013.  I was pretty sure my woes began after I did the "zpool upgrade -a" and subsequently rebooted again, but now I can't be sure whether I successfully rebooted at all after the "make installworld" and mergemaster step.

Does anyone know a sure-fire way of getting a dual console setup (high-speed serial + VGA)?  The recipe I had been using had worked well for a long time.  I had "-S115200 -Dh" in /boot.config and the following entries in /boot/loader.conf:

        boot_multicons="YES"
        comconsole_speed="115200"
        console="comconsole,vidconsole"

Now, though, if I have "-S115200 -Dh" then the system locks up at boot.  Removing /boot.config gets me dual console, but only at 9600 baud. :-(
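
When comparing setups it helps to pull just the console-related settings out of the two files.  A minimal sketch, assuming POSIX sh; `show_console_settings` is a hypothetical helper, and its optional root-directory argument is only there so it can be pointed at a pool mounted under an alternate root.

```shell
#!/bin/sh
# Hypothetical helper: print /boot.config (if present) and the console
# lines from /boot/loader.conf under the given root directory.
show_console_settings() {
    root="${1:-}"
    if [ -f "${root}/boot.config" ]; then
        echo "boot.config: $(cat "${root}/boot.config")"
    else
        echo "boot.config: (absent)"
    fi
    # Match only the three loader variables discussed in this thread
    grep -E '^(boot_multicons|comconsole_speed|console)=' \
        "${root}/boot/loader.conf" 2>/dev/null
}
```

For example, "show_console_settings /tmp/zroot" would inspect a root pool mounted from rescue media, while calling it with no argument reads the running system's files.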

Cheers,

Paul.

PS: Is the BOOT_COMCONSOLE_SPEED entry in /etc/make.conf needed?  I was under the impression it has been obsolete for a while and took it out of my /etc/make.conf file.



Re: Solved?: Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Vincent Hoffman-Kazlauskas
On 03/01/2013 21:18, Paul Mather wrote:

> Now, though, if I have "-S115200 -Dh" then the system locks up at boot.  Removing /boot.config gets me dual console, but only at 9600 baud. :-(
I have
root@copia:/root # more /boot.config
-Dh

and
root@copia:/root # grep 'cons' /boot/loader.conf
console="comconsole,vidconsole"
comconsole_speed="19200"
boot_multicons="yes"

which works fine for an IPMI-based serial-over-LAN console on a generic
9.1-RELEASE.
Not sure that's much help, but 19200 is better than 9600 ;)

Vince



Re: Solved?: Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Paul Mather-3
On Jan 4, 2013, at 5:39 AM, Vincent Hoffman <[hidden email]> wrote:

> I have
> root@copia:/root # more /boot.config
> -Dh
>
> and
> root@copia:/root # grep 'cons' /boot/loader.conf
> console="comconsole,vidconsole"
> comconsole_speed="19200"
> boot_multicons="yes"
>
> Which works fine for ipmi based serial over lan console in a generic
> 9.1-RELEASE.
> Not sure thats that helpful but 19200 is better than 9600 ;)


That's weird.  I have the same basic /boot/loader.conf entries (except for a speed of 115200) and even just putting "-Dh" into /boot.config leads to the same unbootable system behaviour. :-(

Maybe something broke recently in the RELENG_8 boot loader?  Like I said originally, that /boot.config entry had been working for me without problems for a long time.

Cheers,

Paul.

Re: Solved?: Re: Upgrade of RELENG_8 ZFS boot pool leads to unbootable system

Bruce A. Mah-2
If memory serves me right, Paul Mather wrote:

> That's weird.  I have the same basic /boot/loader.conf entries
> (except for a speed of 115200) and even just putting "-Dh" into
> /boot.config leads to the same unbootable system behaviour. :-(
>
> Maybe something broke recently in the RELENG_8 boot loader?  Like I
> said originally, that /boot.config entry had been working for me
> without problems for a long time.

Hi Paul--

I recently ran into this as well.  I didn't do any setting of the
console bitrate, just "-Dh" in /boot.config, and it caused a server I
was upgrading to lock up just as you described.  I had to learn about
mounting ZFS filesystems from the Fixit environment in a hurry.  My
workaround was to get rid of /boot.config, which obviously doesn't
restore the original functionality.

The system in question was (is) running 8.3-RELEASE.
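
Rather than deleting the file outright, the workaround can keep a copy for when the serial settings are revisited.  A sketch assuming POSIX sh and a root pool already mounted from the Fixit environment; `disable_boot_config` and the ".disabled" suffix are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: rename <root>/boot.config so it is no longer
# picked up at boot, keeping the old contents for reference.
disable_boot_config() {
    root="$1"
    cfg="${root}/boot.config"
    if [ -f "$cfg" ]; then
        mv "$cfg" "${cfg}.disabled"
        echo "moved ${cfg} to ${cfg}.disabled"
    else
        echo "no ${cfg} found; nothing to do"
    fi
}
```

With the pool mounted at /tmp/zroot, this would be called as "disable_boot_config /tmp/zroot".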

Bruce.


