Reducing UFS corruption from unclean shutdowns?

Reducing UFS corruption from unclean shutdowns?

Alan Somers-2
I panic my development VM regularly.  Each time, I need to fsck the
file system.  Even if I had run sync(8) just before the panic, I
frequently find corruption.  What should I change to make sync(8)
work, or at least to make corruption rare?  It looks like my root file
system is using soft-updates+journal.  Should I disable those?

-Alan
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "[hidden email]"

Re: Reducing UFS corruption from unclean shutdowns?

Scott Long-2


> On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
>
> I panic my development VM regularly.  Each time, I need to fsck the
> file system.  Even if I had run sync(8) just before the panic, I
> frequently find corruption.  What should I change to make sync(8)
> work, or at least to make corruption rare?  It looks like my root file
> system is using soft-updates+journal.  Should I disable those?
>

What corruption do you regularly see?

Scott


Re: Reducing UFS corruption from unclean shutdowns?

Alan Somers-2
On Fri, Jun 21, 2019 at 1:56 PM Scott Long <[hidden email]> wrote:

>
>
>
> > On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
> >
> > I panic my development VM regularly.  Each time, I need to fsck the
> > file system.  Even if I had run sync(8) just before the panic, I
> > frequently find corruption.  What should I change to make sync(8)
> > work, or at least to make corruption rare?  It looks like my root file
> > system is using soft-updates+journal.  Should I disable those?
> >
>
> What corruption do you regularly see?
>
> Scott

fsck reports various types of errors, all repairable, like "INODE
CHECK-HASH FAILED", "FREE BLK COUNT(S) WRONG IN SUPERBLK", "SUMMARY
INFORMATION BAD", "BLK(S) MISSING IN BIT MAPS", and "UNREF FILE".  If
I don't run fsck, then I get errors when I try to access files, like
"inode XXX: check-hash failed" and "such and such is marked as an
executable file but could not be run by the operating system".
-Alan

Re: Reducing UFS corruption from unclean shutdowns?

Scott Long-2

> On Jun 21, 2019, at 2:09 PM, Alan Somers <[hidden email]> wrote:
>
> On Fri, Jun 21, 2019 at 1:56 PM Scott Long <[hidden email]> wrote:
>>
>>
>>
>>> On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
>>>
>>> I panic my development VM regularly.  Each time, I need to fsck the
>>> file system.  Even if I had run sync(8) just before the panic, I
>>> frequently find corruption.  What should I change to make sync(8)
>>> work, or at least to make corruption rare?  It looks like my root file
>>> system is using soft-updates+journal.  Should I disable those?
>>>
>>
>> What corruption do you regularly see?
>>
>> Scott
>
> fsck reports various types of errors, all repairable, like "INODE
> CHECK-HASH FAILED", "FREE BLK COUNT(S) WRONG IN SUPERBLK", "SUMMARY
> INFORMATION BAD", "BLK(S) MISSING IN BIT MAPS", and "UNREF FILE".  If
> I don't run fsck, then I get errors when I try to access files.  Like
> "inode XXX: check-hash failed" and "such and such is marked as an
> executable file but could not be run by the operating system".
> -Alan

The freeblk count and summary information messages are normal and expected.  I
don’t think that the blks missing message is expected, and the unref file message is
definitely a red flag of something that should have been handled with journal
recovery.  Kirk and Chuck, do you have any insight here?

Thanks,
Scott

Re: Reducing UFS corruption from unclean shutdowns?

Don Lewis-5
On 21 Jun, Scott Long wrote:

>
>> On Jun 21, 2019, at 2:09 PM, Alan Somers <[hidden email]> wrote:
>>
>> On Fri, Jun 21, 2019 at 1:56 PM Scott Long <[hidden email]> wrote:
>>>
>>>
>>>
>>>> On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
>>>>
>>>> I panic my development VM regularly.  Each time, I need to fsck the
>>>> file system.  Even if I had run sync(8) just before the panic, I
>>>> frequently find corruption.  What should I change to make sync(8)
>>>> work, or at least to make corruption rare?  It looks like my root file
>>>> system is using soft-updates+journal.  Should I disable those?
>>>>
>>>
>>> What corruption do you regularly see?
>>>
>>> Scott
>>
>> fsck reports various types of errors, all repairable, like "INODE
>> CHECK-HASH FAILED", "FREE BLK COUNT(S) WRONG IN SUPERBLK", "SUMMARY
>> INFORMATION BAD", "BLK(S) MISSING IN BIT MAPS", and "UNREF FILE".  If
>> I don't run fsck, then I get errors when I try to access files.  Like
>> "inode XXX: check-hash failed" and "such and such is marked as an
>> executable file but could not be run by the operating system".
>> -Alan
>
> The freeblk count and summary information messages are normal and expected.  I
> don’t think that the blks missing message is expected, and the unref file message is
> definitely a red flag of something that should have been handled with journal
> recovery.  Kirk and Chuck, do you have any insight here?

Blks missing is to be expected.  The free block bitmap isn't updated
until after the pointers to them in the inode are cleared.  Likewise the
unref file warning is also to be expected because the reference to the
inode in the parent directory is cleared before the inode is cleared.
These aren't a fatal problem, just a resource leak until fsck is run.
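
To make that ordering concrete, here is a toy model in Python.  It is
only a sketch: ToyFS, free_block() and leaked_blocks() are names
invented for this illustration and do not correspond to the real UFS
structures or the soft-updates code.  It follows the order described
above (inode pointer cleared first, bitmap updated second) and shows
why a crash between the two steps leaks a block rather than damaging a
live file.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFS:
    """Hypothetical stand-in for on-disk state; not the UFS layout."""
    inode_blocks: set = field(default_factory=set)   # blocks the inode points at
    bitmap_in_use: set = field(default_factory=set)  # blocks marked allocated


def free_block(fs: ToyFS, blk: int, crash_after_step: int = 2) -> None:
    """Free blk in two ordered steps; optionally 'crash' after step 1."""
    fs.inode_blocks.discard(blk)       # step 1: clear the pointer in the inode
    if crash_after_step < 2:
        return                         # simulated panic before the bitmap write
    fs.bitmap_in_use.discard(blk)      # step 2: mark the block free in the bitmap


def leaked_blocks(fs: ToyFS) -> set:
    """Blocks still marked in use that no inode references any more."""
    return fs.bitmap_in_use - fs.inode_blocks


fs = ToyFS(inode_blocks={10, 11}, bitmap_in_use={10, 11})
free_block(fs, 11, crash_after_step=1)   # panic between the two writes
print(leaked_blocks(fs))                 # {11}
```

In this model no inode ever points at a block the bitmap considers
free, which is the dangerous direction; the only damage is block 11
sitting unused until something reclaims it.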

I would not expect the inode check-hash error.  I would expect the hash
update to happen at the same time as any other bits of the inode are
changed, but this is a new feature that went in after I last looked at
the code.
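
As an illustration of what a stale check-hash looks like, here is a
small Python sketch.  The field strings, seal() and verify() are all
hypothetical, and zlib.crc32 is just a stand-in for whatever hash the
kernel really computes over the on-disk inode.  The point is only that
if the inode fields reach the disk without a matching hash, the check
fails the next time the inode is read.

```python
import zlib


def seal(inode_fields: bytes) -> bytes:
    """Append a 4-byte checksum of the fields (CRC-32 as a stand-in hash)."""
    return inode_fields + zlib.crc32(inode_fields).to_bytes(4, "little")


def verify(on_disk: bytes) -> bool:
    """Recompute the checksum over the fields and compare with the stored one."""
    fields, stored = on_disk[:-4], on_disk[-4:]
    return zlib.crc32(fields).to_bytes(4, "little") == stored


good = seal(b"mode=0644 size=4096")
# Fields changed on disk, but the old hash was left in place.
stale = b"mode=0644 size=8192" + good[-4:]
print(verify(good), verify(stale))   # True False
```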

Re: Reducing UFS corruption from unclean shutdowns?

Scott Long-2


> On Jun 21, 2019, at 3:49 PM, Don Lewis <[hidden email]> wrote:
>
> On 21 Jun, Scott Long wrote:
>>
>>> On Jun 21, 2019, at 2:09 PM, Alan Somers <[hidden email]> wrote:
>>>
>>> On Fri, Jun 21, 2019 at 1:56 PM Scott Long <[hidden email]> wrote:
>>>>
>>>>
>>>>
>>>>> On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
>>>>>
>>>>> I panic my development VM regularly.  Each time, I need to fsck the
>>>>> file system.  Even if I had run sync(8) just before the panic, I
>>>>> frequently find corruption.  What should I change to make sync(8)
>>>>> work, or at least to make corruption rare?  It looks like my root file
>>>>> system is using soft-updates+journal.  Should I disable those?
>>>>>
>>>>
>>>> What corruption do you regularly see?
>>>>
>>>> Scott
>>>
>>> fsck reports various types of errors, all repairable, like "INODE
>>> CHECK-HASH FAILED", "FREE BLK COUNT(S) WRONG IN SUPERBLK", "SUMMARY
>>> INFORMATION BAD", "BLK(S) MISSING IN BIT MAPS", and "UNREF FILE".  If
>>> I don't run fsck, then I get errors when I try to access files.  Like
>>> "inode XXX: check-hash failed" and "such and such is marked as an
>>> executable file but could not be run by the operating system".
>>> -Alan
>>
>> The freeblk count and summary information messages are normal and expected.  I
>> don’t think that the blks missing message is expected, and the unref file message is
>> definitely a red flag of something that should have been handled with journal
>> recovery.  Kirk and Chuck, do you have any insight here?
>
> Blks missing is to be expected.  The free block bitmap isn't updated
> until after the pointers to them in the inode are cleared.  Likewise the
> unref file warning is also to be expected because the reference to the
> inode in the parent directory is cleared before the inode is cleared.
> These aren't a fatal problem, just a resource leak until fsck is run.
>
> I would not expect the inode check-hash error.  I would expect the hash
> update to happen at the same time as any other bits of the inode are
> changed, but this is a new feature that went in after I last looked at
> the code.
>

I thought that unref’d files were also supposed to be cleaned up on journal recovery,
different from plain SU recovery/preening.  It’s been so long, maybe I don’t remember
correctly anymore.

Scott

Re: Reducing UFS corruption from unclean shutdowns?

Alan Somers-2
On Fri, Jun 21, 2019 at 3:51 PM Scott Long <[hidden email]> wrote:

>
>
>
> > On Jun 21, 2019, at 3:49 PM, Don Lewis <[hidden email]> wrote:
> >
> > On 21 Jun, Scott Long wrote:
> >>
> >>> On Jun 21, 2019, at 2:09 PM, Alan Somers <[hidden email]> wrote:
> >>>
> >>> On Fri, Jun 21, 2019 at 1:56 PM Scott Long <[hidden email]> wrote:
> >>>>
> >>>>
> >>>>
> >>>>> On Jun 21, 2019, at 1:49 PM, Alan Somers <[hidden email]> wrote:
> >>>>>
> >>>>> I panic my development VM regularly.  Each time, I need to fsck the
> >>>>> file system.  Even if I had run sync(8) just before the panic, I
> >>>>> frequently find corruption.  What should I change to make sync(8)
> >>>>> work, or at least to make corruption rare?  It looks like my root file
> >>>>> system is using soft-updates+journal.  Should I disable those?
> >>>>>
> >>>>
> >>>> What corruption do you regularly see?
> >>>>
> >>>> Scott
> >>>
> >>> fsck reports various types of errors, all repairable, like "INODE
> >>> CHECK-HASH FAILED", "FREE BLK COUNT(S) WRONG IN SUPERBLK", "SUMMARY
> >>> INFORMATION BAD", "BLK(S) MISSING IN BIT MAPS", and "UNREF FILE".  If
> >>> I don't run fsck, then I get errors when I try to access files.  Like
> >>> "inode XXX: check-hash failed" and "such and such is marked as an
> >>> executable file but could not be run by the operating system".
> >>> -Alan
> >>
> >> The freeblk count and summary information messages are normal and expected.  I
> >> don’t think that the blks missing message is expected, and the unref file message is
> >> definitely a red flag of something that should have been handled with journal
> >> recovery.  Kirk and Chuck, do you have any insight here?
> >
> > Blks missing is to be expected.  The free block bitmap isn't updated
> > until after the pointers to them in the inode are cleared.  Likewise the
> > unref file warning is also to be expected because the reference to the
> > inode in the parent directory is cleared before the inode is cleared.
> > These aren't a fatal problem, just a resource leak until fsck is run.
> >
> > I would not expect the inode check-hash error.  I would expect the hash
> > update to happen at the same time as any other bits of the inode are
> > changed, but this is a new feature that went in after I last looked at
> > the code.
> >
>
> I thought that unref’d files were also supposed to be cleaned up on journal recovery,
> different from plain SU recovery/preening.  It’s been so long, maybe I don’t remember
> correctly anymore.
>
> Scott

I would've thought that immediately following a sync(8), the
filesystem would be consistent.  Why do I still see errors after a
panic in files that were written before I sync()ed?
-Alan

Re: Reducing UFS corruption from unclean shutdowns?

Conrad Meyer-2
On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]> wrote:
> I would've thought that immediately following a sync(8), the
> filesystem would be consistent.  Why do I still see errors after a
> panic in files that were written before I sync()ed?
> -Alan

Hi Alan,

Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
VFS_SYNC() with MNT_NOWAIT across all mountpoints.
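
For a program that needs its own data on disk before carrying on, the
blocking call is fsync(2) on the file descriptor rather than sync(2).
A minimal Python sketch, where write_durably() is a made-up helper
name and os.fsync() is the standard library wrapper for the system
call:

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> None:
    """Write data to path and return only after fsync(2) has completed."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        offset = 0
        while offset < len(data):      # os.write may write fewer bytes than asked
            offset += os.write(fd, data[offset:])
        os.fsync(fd)                   # blocks until the kernel reports the I/O done
    finally:
        os.close(fd)


# Usage: calling os.sync() here instead would only request a flush.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.dat")
    write_durably(target, b"important bytes\n")
```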

Cheers,
Conrad

Re: Reducing UFS corruption from unclean shutdowns?

Simon J. Gerraty
In reply to this post by Alan Somers-2
Alan Somers <[hidden email]> wrote:
> I would've thought that immediately following a sync(8), the
> filesystem would be consistent.  Why do I still see errors after a

sync(8) does little more than tell the kernel to start flushing some
pages, which the kernel would do anyway in about 30s.
So, it does not really ensure a clean filesystem if you are about to
reboot - and certainly not if a panic is imminent.

FWIW, to minimize fs problems after doing a package install on Junos I run
fsync on files and dirs I know are likely to have been updated and which
need to be flushed to disk before reboot.

Unlike sync(8), fsync(1) will not return until the I/O is done.
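
A rough Python equivalent of that habit, as a sketch only:
fsync_path() and flush_updated() are hypothetical helper names, and
the list of paths is whatever you know was just modified.  Each file
is fsynced, then each distinct parent directory, mirroring "files and
dirs" above.  It assumes the platform allows fsync on a descriptor
opened read-only, including one for a directory.

```python
import os


def fsync_path(path: str) -> None:
    """Open path read-only and block until fsync(2) on it returns."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def flush_updated(paths):
    """fsync each file, then each distinct parent directory.

    Returns the sorted list of directories that were flushed.
    """
    parents = set()
    for p in paths:
        fsync_path(p)
        parents.add(os.path.dirname(os.path.abspath(p)))
    flushed = sorted(parents)
    for d in flushed:
        fsync_path(d)
    return flushed
```

Calling flush_updated() with the files a package install touched,
just before the reboot, is the intended use; it raises OSError if any
path cannot be opened or flushed.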

You may still lose data after a sudden power cycle or panic, but it is
less likely to be something critical.

Re: Reducing UFS corruption from unclean shutdowns?

Warner Losh
In reply to this post by Conrad Meyer-2
On Fri, Jun 21, 2019, 3:33 PM Conrad Meyer <[hidden email]> wrote:

> On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]> wrote:
> > I would've thought that immediately following a sync(8), the
> > filesystem would be consistent.  Why do I still see errors after a
> > panic in files that were written before I sync()ed?
> > -Alan
>
> Hi Alan,
>
> Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
> VFS_SYNC() with MNT_NOWAIT across all mountpoints.
>

Yes. sync(2) just starts the I/O, but it may be delayed if there are a lot
of dirty buffers. The other issue is that new buffers may be dirtied...

Warner


Re: Reducing UFS corruption from unclean shutdowns?

Scott Long-2


> On Jun 21, 2019, at 4:37 PM, Warner Losh <[hidden email]> wrote:
>
> On Fri, Jun 21, 2019, 3:33 PM Conrad Meyer <[hidden email]> wrote:
>
>> On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]> wrote:
>>> I would've thought that immediately following a sync(8), the
>>> filesystem would be consistent.  Why do I still see errors after a
>>> panic in files that were written before I sync()ed?
>>> -Alan
>>
>> Hi Alan,
>>
>> Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
>> VFS_SYNC() with MNT_NOWAIT across all mountpoints.
>>
>
> Yes. Sync(2) just starts the I/O, but it may be delayed if there is a lot
> of dirty buffers. The other issue is that new buffers may be dirtied…
>

Still, the point of SU and SU+J is that the filesystem should not be
damaged and require active repair on reboot, whether or not a
sync or fsync was done.  There are certainly issues with disks lying
about out-of-order writes, POSIX semantics of unlinked files, and the
inherent design of UFS superblock updates, but the problems that
Alan reported should still be looked at; they’re not expected and
they undermine the usefulness of SU+J.

Scott



Re: Reducing UFS corruption from unclean shutdowns?

Warner Losh
On Fri, Jun 21, 2019, 3:44 PM Scott Long <[hidden email]> wrote:

>
>
> > On Jun 21, 2019, at 4:37 PM, Warner Losh <[hidden email]> wrote:
> >
> > On Fri, Jun 21, 2019, 3:33 PM Conrad Meyer <[hidden email]> wrote:
> >
> >> On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]>
> wrote:
> >>> I would've thought that immediately following a sync(8), the
> >>> filesystem would be consistent.  Why do I still see errors after a
> >>> panic in files that were written before I sync()ed?
> >>> -Alan
> >>
> >> Hi Alan,
> >>
> >> Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
> >> VFS_SYNC() with MNT_NOWAIT across all mountpoints.
> >>
> >
> > Yes. Sync(2) just starts the I/O, but it may be delayed if there is a lot
> > of dirty buffers. The other issue is that new buffers may be dirtied…
> >
>
> Still, the point of SU and SU+J is that the filesystem should not be
> damaged and require active repair on reboot, whether or not a
> sync or fsync was done.  There’s certainly issues with disk lying
> > about out of order writes, POSIX semantics of unlinked files, and the
> inherent design of UFS superblock updates, but the problems that
> Alan reported should still be looked at, they’re not expected and
> they undermine the usefulness of SU+J.
>

Yeah. The ATA write cache might cause it, but only once in a while and
usually only on power failure. Some drives / devices need a final flush, so
that might be an issue. I fixed an issue like this in nvme on shutdown, but
panic should trigger that code...

Warner


Re: Reducing UFS corruption from unclean shutdowns?

Alan Somers-2
On Fri, Jun 21, 2019 at 4:50 PM Warner Losh <[hidden email]> wrote:

>
>
>
> On Fri, Jun 21, 2019, 3:44 PM Scott Long <[hidden email]> wrote:
>>
>>
>>
>> > On Jun 21, 2019, at 4:37 PM, Warner Losh <[hidden email]> wrote:
>> >
>> > On Fri, Jun 21, 2019, 3:33 PM Conrad Meyer <[hidden email]> wrote:
>> >
>> >> On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]> wrote:
>> >>> I would've thought that immediately following a sync(8), the
>> >>> filesystem would be consistent.  Why do I still see errors after a
>> >>> panic in files that were written before I sync()ed?
>> >>> -Alan
>> >>
>> >> Hi Alan,
>> >>
>> >> Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
>> >> VFS_SYNC() with MNT_NOWAIT across all mountpoints.
>> >>
>> >
>> > Yes. Sync(2) just starts the I/O, but it may be delayed if there is a lot
>> > of dirty buffers. The other issue is that new buffers may be dirtied…
>> >
>>
>> Still, the point of SU and SU+J is that the filesystem should not be
>> damaged and require active repair on reboot, whether or not a
>> sync or fsync was done.  There’s certainly issues with disk lying
>> about out of order writes, POSIX semantics of unlinked files, and the
>> inherent design of UFS superblock updates, but the problems that
>> Alan reported should still be looked at, they’re not expected and
>> they undermine the usefulness of SU+J.
>
>
> Yea. Ata write cache might cause it. But only once in a while and usually only with power fail. Some drives / devices need a final flush, so that might be an issue. I fixed an issue in nvme on shutdown like this, but panic should trigger that code...
>
> Warner

No ATA write cache here.  I'm using bhyve with a virtio disk.

Re: Reducing UFS corruption from unclean shutdowns?

RW-15
In reply to this post by Alan Somers-2
On Fri, 21 Jun 2019 13:49:30 -0600
Alan Somers wrote:

> I panic my development VM regularly.  Each time, I need to fsck the
> file system.  

I've found UFS on gjournal to be reliable.

Re: Reducing UFS corruption from unclean shutdowns?

Don Lewis-5
In reply to this post by Scott Long-2
On 21 Jun, Scott Long wrote:

>
>
>> On Jun 21, 2019, at 4:37 PM, Warner Losh <[hidden email]> wrote:
>>
>> On Fri, Jun 21, 2019, 3:33 PM Conrad Meyer <[hidden email]> wrote:
>>
>>> On Fri, Jun 21, 2019 at 2:55 PM Alan Somers <[hidden email]> wrote:
>>>> I would've thought that immediately following a sync(8), the
>>>> filesystem would be consistent.  Why do I still see errors after a
>>>> panic in files that were written before I sync()ed?
>>>> -Alan
>>>
>>> Hi Alan,
>>>
>>> Contra the name, sync(2) (sync(8)) isn't synchronous.  It invokes
>>> VFS_SYNC() with MNT_NOWAIT across all mountpoints.
>>>
>>
>> Yes. Sync(2) just starts the I/O, but it may be delayed if there is a lot
>> of dirty buffers. The other issue is that new buffers may be dirtied…
>>
>
> Still, the point of SU and SU+J is that the filesystem should not be
> damaged and require active repair on reboot, whether or not a
> sync or fsync was done.  There’s certainly issues with disk lying
> about out of order writes, POSIX semantics of unlinked files, and the
> inherent design of UFS superblock updates, but the problems that
> Alan reported should still be looked at, they’re not expected and
> they undermine the usefulness of SU+J.

Other than the inode hash error, the other issues should not prevent
safely mounting the filesystem read-write.  SU without J is able to fix
these problems with a background fsck while the filesystem is mounted
and in use.

SU+J should be able to fix all of these except for the inode hash error
by replaying the journal, but that is done by fsck.  At least it can
avoid the need to scan the entire filesystem.

The problem of the disk lying about write completions should only be a
problem if the power fails, or if we do something during the panic and
recovery that tells the disk to toss its write cache.

The main problem here is the inode hash error.  That shouldn't be
happening.
