Bug ID: 222288
Summary: g_bio leak after zfs ABD commit
Product: Base System
Severity: Affects Many People
Assignee: [hidden email]
Reporter: [hidden email]
It looks like the ABD commit may have introduced a leak of geom bio structs.
It's a slow leak so it's not obvious, but after a few weeks of uptime, one of
my 8GB-RAM systems had accumulated a couple million g_bio objects according to
vmstat (over 2GB of RAM) and was starting to swap. I rebooted it, and after 11
hours, it is already back up to 310182 g_bio objects.
Looking at base r321610 itself, it looks like there's a g_bio_destroy() call
that got relocated from vdev_geom_io_intr() to vdev_geom_io_done(); maybe there
are cases where vdev_geom_io_intr() is called but vdev_geom_io_done() isn't? I
don't know enough about ZFS internals to get any further than this.
Rolling the kernel back to r321609 makes the leak stop, and updating to r321610
makes it appear again.
--- Comment #1 from Andriy Gapon <[hidden email]> ---
(In reply to Dan Nelson from comment #0)
Thank you very much for the report!
As soon as I read it I noticed that I have the same kind of issue. I dug
through the biozone to examine the leaked bio-s and they all seem to have
bio_cmd = 5 and bio_flags = 8. So, they seem to be BIO_FLUSH bio-s. Their
zio-s must have been ZIO_TYPE_IOCTL.
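As a rough illustration of the check described above, a leaked bio can be classified from its two fields. This is only a sketch: the TOY_ constants and toy_is_ordered_flush() are hypothetical names, the command value 5 comes from this comment, and reading flag bit 8 as the "ordered" flag is an assumption that should be verified against sys/bio.h in the tree being debugged.

```c
#include <assert.h>

/*
 * Assumed values, intended to mirror sys/bio.h of that era.
 * Verify them against your own source tree before relying on them.
 */
#define TOY_BIO_FLUSH   0x05u /* bio_cmd value seen on the leaked bio-s */
#define TOY_BIO_ORDERED 0x08u /* bio_flags bit seen on the leaked bio-s */

/* Return 1 if the fields describe an ordered flush request, 0 otherwise. */
int
toy_is_ordered_flush(unsigned int bio_cmd, unsigned int bio_flags)
{
	return (bio_cmd == TOY_BIO_FLUSH &&
	    (bio_flags & TOY_BIO_ORDERED) != 0);
}
```

Calling toy_is_ordered_flush(5, 8) with the values reported above matches the pattern, while a bio with any other command does not.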
Now, those zio-s use ZIO_IOCTL_PIPELINE and it is defined as:
#define ZIO_IOCTL_PIPELINE \
(ZIO_INTERLOCK_STAGES | \
ZIO_STAGE_VDEV_IO_START | \
So, ZIO_STAGE_VDEV_IO_START is in the pipeline, but ZIO_STAGE_VDEV_IO_DONE is
not, as you correctly theorised.
The normal I/O pipelines always include ZIO_VDEV_IO_STAGES:
#define ZIO_VDEV_IO_STAGES \
(ZIO_STAGE_VDEV_IO_START | \
ZIO_STAGE_VDEV_IO_DONE | \
so the problem does not affect them.
I will double-check why the ioctl pipeline omitted ZIO_STAGE_VDEV_IO_DONE and
will test adding that stage.
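
The suspected mechanism can be shown with a small user-space model. This is a sketch under stated assumptions, not the ZFS code: every toy_ and TOY_ name is hypothetical, the stage bit values are made up, and a plain counter stands in for the g_bio zone. It assumes only what the thread describes: the bio is created in the "start" stage, released in the "done" stage, and a stage runs only if its bit is set in the zio's pipeline mask.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage bits; the real ZIO_STAGE_* values are different. */
#define TOY_STAGE_VDEV_IO_START (UINT32_C(1) << 0)
#define TOY_STAGE_VDEV_IO_DONE  (UINT32_C(1) << 1)

/* Toy pipelines: the ioctl one lacks the "done" stage, as in the report. */
#define TOY_IOCTL_PIPELINE (TOY_STAGE_VDEV_IO_START)
#define TOY_VDEV_IO_STAGES (TOY_STAGE_VDEV_IO_START | TOY_STAGE_VDEV_IO_DONE)

/* Minimal stand-in for a zio: a stage mask and a "holds a bio" flag. */
struct toy_zio {
	uint32_t pipeline;
	int has_bio;
};

/* Plays the role of the g_bio object count reported by vmstat. */
static long toy_live_bios;

/* "Start" stage: the bio is allocated here. */
static void
toy_io_start(struct toy_zio *zio)
{
	zio->has_bio = 1;
	toy_live_bios++;
}

/* "Done" stage: the only place in this model where the bio is released. */
static void
toy_io_done(struct toy_zio *zio)
{
	if (zio->has_bio) {
		zio->has_bio = 0;
		toy_live_bios--;
	}
}

/* Run only the stages whose bits are set in the zio's pipeline mask. */
static void
toy_execute(struct toy_zio *zio)
{
	if (zio->pipeline & TOY_STAGE_VDEV_IO_START)
		toy_io_start(zio);
	if (zio->pipeline & TOY_STAGE_VDEV_IO_DONE)
		toy_io_done(zio);
}

/* Issue `count` zios with the given pipeline; return bios still live. */
long
toy_leaked_after(uint32_t pipeline, int count)
{
	assert(count >= 0);
	toy_live_bios = 0;
	for (int i = 0; i < count; i++) {
		struct toy_zio zio = { pipeline, 0 };
		toy_execute(&zio);
	}
	return (toy_live_bios);
}
```

In this model, toy_leaked_after(TOY_IOCTL_PIPELINE, n) leaves one object behind per request, while passing TOY_IOCTL_PIPELINE | TOY_STAGE_VDEV_IO_DONE or TOY_VDEV_IO_STAGES leaves none. Whether adding the stage is the right change for the real ioctl pipeline is exactly what remains to be checked.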