Step 2 Was: more strict KPI for vm_pager_get_pages()


Step 2 Was: more strict KPI for vm_pager_get_pages()

Gleb Smirnoff
  Hi!

  This is step 2 of the "more strict pager KPI" patch:

o Uninline vm_pager_get_pages() into vm_pager.c.
o Keep all KASSERTs that enforce the KPI of pagers and of their
  consumers in one place: vm_pager_get_pages().
o Keep the code that zeroes out partially valid pages in one
  place in vm_pager_get_pages().
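
In outline, the change turns the inline wrapper into this (condensed
from the patch quoted downthread; the KPI-enforcing KASSERT block is
elided here):

int
vm_pager_get_pages(vm_object_t object, vm_page_t *m, int count, int reqpage)
{
	int r;

	VM_OBJECT_ASSERT_WLOCKED(object);
	KASSERT(count > 0, ("%s: 0 count", __func__));

	/* ... KASSERTs enforcing the pager KPI on m[0..count-1] ... */

	r = (*pagertab[object->type]->pgo_getpages)(object, m, count, reqpage);
	if (r != VM_PAGER_OK)
		return (r);

	/* Pager didn't fill up the entire page; zero the invalid parts. */
	if (m[reqpage]->valid != VM_PAGE_BITS_ALL)
		vm_page_zero_invalid(m[reqpage], TRUE);

	return (VM_PAGER_OK);
}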

--
Totus tuus, Glebius.


Attachment: pagers.step.2.diff (11K)

Re: Step 2 Was: more strict KPI for vm_pager_get_pages()

Gleb Smirnoff
  Hi!

On Tue, Jun 16, 2015 at 12:29:31AM +0300, Gleb Smirnoff wrote:
T>   This is step 2 of the "more strict pager KPI" patch:
T>
T> o Uninline vm_pager_get_pages() into vm_pager.c.
T> o Keep all KASSERTs that enforce the KPI of pagers and of their
T>   consumers in one place: vm_pager_get_pages().
T> o Keep the code that zeroes out partially valid pages in one
T>   place in vm_pager_get_pages().

As Konstantin pointed out, it is very likely that right now NFS and SMB
do the zeroing of partially valid pages incorrectly. So this code
is removed from the generic vm_pager_get_pages(), which reduces the
step 2 patch to a very small change:

o Uninline vm_pager_get_pages() into vm_pager.c.
o Keep all KASSERTs that enforce the KPI of pagers and of their
  consumers in one place: vm_pager_get_pages().
o Remove some KASSERTs from pagers that are already there in
  vm_pager_get_pages().
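
For reference, the consolidated assertion block, as it appears in the
attached diff:

#ifdef INVARIANTS
	/*
	 * All pages must be busied, not mapped, not valid (save the
	 * last one), not dirty, and belong to the proper object.
	 */
	for (int i = 0; i < count; i++) {
		vm_page_assert_xbusied(m[i]);
		KASSERT(!pmap_page_is_mapped(m[i]),
		    ("%s: page %p is mapped", __func__, m[i]));
		KASSERT(m[i]->valid == 0 || i == count - 1,
		    ("%s: request for a valid page %p", __func__, m[i]));
		KASSERT(m[i]->dirty == 0,
		    ("%s: page %p is dirty", __func__, m[i]));
		KASSERT(m[i]->object == object,
		    ("%s: wrong object %p/%p", __func__, object,
		    m[i]->object));
	}
#endif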

Patch attached.

--
Totus tuus, Glebius.


Attachment: pagers.step.2.diff (6K)

Re: Step 2 Was: more strict KPI for vm_pager_get_pages()

Konstantin Belousov
On Wed, Jun 17, 2015 at 01:14:45PM +0300, Gleb Smirnoff wrote:

>   Hi!
>
> On Tue, Jun 16, 2015 at 12:29:31AM +0300, Gleb Smirnoff wrote:
> T>   This is step 2 of the "more strict pager KPI" patch:
> T>
> T> o Uninline vm_pager_get_pages() into vm_pager.c.
> T> o Keep all KASSERTs that enforce the KPI of pagers and of their
> T>   consumers in one place: vm_pager_get_pages().
> T> o Keep the code that zeroes out partially valid pages in one
> T>   place in vm_pager_get_pages().
>
> As Konstantin pointed out, it is very likely that right now NFS and SMB
> do the zeroing of partially valid pages incorrectly. So this code
> is removed from the generic vm_pager_get_pages(), which reduces the
> step 2 patch to a very small change:
>
> o Uninline vm_pager_get_pages() into vm_pager.c.
> o Keep all KASSERTs that enforce the KPI of pagers and of their
>   consumers in one place: vm_pager_get_pages().
> o Remove some KASSERTs from pagers that are already there in
>   vm_pager_get_pages().
>
> Patch attached.
>
> --
> Totus tuus, Glebius.

> Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c
> ===================================================================
> --- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c (revision 284504)
> +++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c (working copy)
> @@ -5734,8 +5734,6 @@ zfs_getpages(struct vnode *vp, vm_page_t *m, int c
>   object = mreq->object;
>   error = 0;
>  
> - KASSERT(vp->v_object == object, ("mismatching object"));
> -
>   if (pcount > 1 && zp->z_blksz > PAGESIZE) {
>   startoff = rounddown(IDX_TO_OFF(mreq->pindex), zp->z_blksz);
>   reqstart = OFF_TO_IDX(round_page(startoff));
> Index: sys/fs/nfsclient/nfs_clbio.c
> ===================================================================
> --- sys/fs/nfsclient/nfs_clbio.c (revision 284504)
> +++ sys/fs/nfsclient/nfs_clbio.c (working copy)
> @@ -129,12 +129,6 @@ ncl_getpages(struct vop_getpages_args *ap)
>   npages = btoc(count);
>  
>   /*
> - * Since the caller has busied the requested page, that page's valid
> - * field will not be changed by other threads.
> - */
> - vm_page_assert_xbusied(pages[ap->a_reqpage]);
> -
> - /*
>   * If the requested page is partially valid, just return it and
>   * allow the pager to zero-out the blanks.  Partially valid pages
>   * can only occur at the file EOF.
> Index: sys/vm/swap_pager.c
> ===================================================================
> --- sys/vm/swap_pager.c (revision 284504)
> +++ sys/vm/swap_pager.c (working copy)
> @@ -1118,10 +1118,6 @@ swap_pager_getpages(vm_object_t object, vm_page_t
>  
>   mreq = m[reqpage];
>  
> - KASSERT(mreq->object == object,
> -    ("swap_pager_getpages: object mismatch %p/%p",
> -    object, mreq->object));
> -
>   /*
>   * Calculate range to retrieve.  The pages have already been assigned
>   * their swapblks.  We require a *contiguous* range but we know it to
> Index: sys/vm/vm_pager.c
> ===================================================================
> --- sys/vm/vm_pager.c (revision 284504)
> +++ sys/vm/vm_pager.c (working copy)
> @@ -251,7 +251,88 @@ vm_pager_deallocate(object)
>  }
>  
>  /*
> - * vm_pager_get_pages() - inline, see vm/vm_pager.h
> + * Retrieve pages from the VM system in order to map them into an object
> + * ( or into VM space somewhere ).  If the pagein was successful, we
> + * must fully validate it.
> + */
> +int
> +vm_pager_get_pages(vm_object_t object, vm_page_t *m, int count, int reqpage)
> +{
> + int r;
> +
> + VM_OBJECT_ASSERT_WLOCKED(object);
> + KASSERT(count > 0, ("%s: 0 count", __func__));
> +
> +#ifdef INVARIANTS
> + /*
> + * All pages must be busied, not mapped, not valid (save the last one),
> + * not dirty and belong to the proper object.
> + */
> + for (int i = 0 ; i < count; i++) {
> + vm_page_assert_xbusied(m[i]);
> + KASSERT(!pmap_page_is_mapped(m[i]),
> +    ("%s: page %p is mapped", __func__, m[i]));
> + KASSERT(m[i]->valid == 0 || i == count - 1,
> +    ("%s: request for a valid page %p", __func__, m[i]));
Are you sure that, e.g., the requested page cannot be partially valid?
I am surprised.

> + KASSERT(m[i]->dirty == 0,
> +    ("%s: page %p is dirty", __func__, m[i]));
> + KASSERT(m[i]->object == object,
> +    ("%s: wrong object %p/%p", __func__, object, m[i]->object));
> + }
> +#endif
Create a helper function containing the loop and use it, instead of
copying the block twice.
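
A sketch of such a helper (the name vm_pager_assert_in() is
illustrative, not taken from the patch):

#ifdef INVARIANTS
static void
vm_pager_assert_in(vm_object_t object, vm_page_t *m, int count)
{

	/*
	 * All pages must be busied, not mapped, not valid (save the
	 * last one), not dirty, and belong to the proper object.
	 */
	for (int i = 0; i < count; i++) {
		vm_page_assert_xbusied(m[i]);
		KASSERT(!pmap_page_is_mapped(m[i]),
		    ("%s: page %p is mapped", __func__, m[i]));
		KASSERT(m[i]->valid == 0 || i == count - 1,
		    ("%s: request for a valid page %p", __func__, m[i]));
		KASSERT(m[i]->dirty == 0,
		    ("%s: page %p is dirty", __func__, m[i]));
		KASSERT(m[i]->object == object,
		    ("%s: wrong object %p/%p", __func__, object,
		    m[i]->object));
	}
}
#else
#define	vm_pager_assert_in(object, m, count)
#endif

Both vm_pager_get_pages() and vm_pager_get_pages_async() would then open
with vm_pager_assert_in(object, m, count).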

> +
> + r = (*pagertab[object->type]->pgo_getpages)(object, m, count, reqpage);
> + if (r != VM_PAGER_OK)
> + return (r);
> +
> + /*
> + * If pager has replaced the page, assert that it had
> + * updated the array.
> + */
> + KASSERT(m[reqpage] == vm_page_lookup(object, m[reqpage]->pindex),
> +    ("%s: mismatch page %p pindex %ju", __func__,
> +    m[reqpage], (uintmax_t )m[reqpage]->pindex - 1));
Why -1?
An empty line is needed after the assert, to delineate the code that
the following comment describes.

Also, you may assert that the requested page is still busied.
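
For instance (illustrative, not from the patch):

	/* The pager must not have unbusied the requested page. */
	vm_page_assert_xbusied(m[reqpage]);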

> + /*
> + * Pager didn't fill up entire page.  Zero out
> + * partially filled data.
> + */
> + if (m[reqpage]->valid != VM_PAGE_BITS_ALL)
> + vm_page_zero_invalid(m[reqpage], TRUE);
> +
> + return (VM_PAGER_OK);
> +}
> +
> +int
> +vm_pager_get_pages_async(vm_object_t object, vm_page_t *m, int count,
> +    int reqpage, pgo_getpages_iodone_t iodone, void *arg)
> +{
> +
> + VM_OBJECT_ASSERT_WLOCKED(object);
> + KASSERT(count > 0, ("%s: 0 count", __func__));
> +
> +#ifdef INVARIANTS
> + /*
> + * All pages must be busied, not mapped, not valid (save the last one),
> + * not dirty and belong to the proper object.
> + */
> + for (int i = 0 ; i < count; i++) {
> + vm_page_assert_xbusied(m[i]);
> + KASSERT(!pmap_page_is_mapped(m[i]),
> +    ("%s: page %p is mapped", __func__, m[i]));
> + KASSERT(m[i]->valid == 0 || i == count - 1,
> +    ("%s: request for a valid page %p", __func__, m[i]));
> + KASSERT(m[i]->dirty == 0,
> +    ("%s: page %p is dirty", __func__, m[i]));
> + KASSERT(m[i]->object == object,
> +    ("%s: wrong object %p/%p", __func__, object, m[i]->object));
> + }
> +#endif
> +
> + return ((*pagertab[object->type]->pgo_getpages_async)(object, m,
> +    count, reqpage, iodone, arg));
> +}
> +
> +/*
>   * vm_pager_put_pages() - inline, see vm/vm_pager.h
>   * vm_pager_has_page() - inline, see vm/vm_pager.h
>   */
> Index: sys/vm/vm_pager.h
> ===================================================================
> --- sys/vm/vm_pager.h (revision 284504)
> +++ sys/vm/vm_pager.h (working copy)
> @@ -106,9 +106,9 @@ vm_object_t vm_pager_allocate(objtype_t, void *, v
>      vm_ooffset_t, struct ucred *);
>  void vm_pager_bufferinit(void);
>  void vm_pager_deallocate(vm_object_t);
> -static __inline int vm_pager_get_pages(vm_object_t, vm_page_t *, int, int);
> -static inline int vm_pager_get_pages_async(vm_object_t, vm_page_t *, int,
> -    int, pgo_getpages_iodone_t, void *);
> +int vm_pager_get_pages(vm_object_t, vm_page_t *, int, int);
> +int vm_pager_get_pages_async(vm_object_t, vm_page_t *, int, int,
> +    pgo_getpages_iodone_t, void *);
>  static __inline boolean_t vm_pager_has_page(vm_object_t, vm_pindex_t, int *, int *);
>  void vm_pager_init(void);
>  vm_object_t vm_pager_object_lookup(struct pagerlst *, void *);
> @@ -115,40 +115,6 @@ vm_object_t vm_pager_object_lookup(struct pagerlst
>  void vm_pager_free_nonreq(vm_object_t object, vm_page_t ma[], int reqpage,
>      int npages, boolean_t object_locked);
>  
> -/*
> - * vm_page_get_pages:
> - *
> - * Retrieve pages from the VM system in order to map them into an object
> - * ( or into VM space somewhere ).  If the pagein was successful, we
> - * must fully validate it.
> - */
> -static __inline int
> -vm_pager_get_pages(
> - vm_object_t object,
> - vm_page_t *m,
> - int count,
> - int reqpage
> -) {
> - int r;
> -
> - VM_OBJECT_ASSERT_WLOCKED(object);
> - r = (*pagertab[object->type]->pgo_getpages)(object, m, count, reqpage);
> - if (r == VM_PAGER_OK && m[reqpage]->valid != VM_PAGE_BITS_ALL) {
> - vm_page_zero_invalid(m[reqpage], TRUE);
> - }
> - return (r);
> -}
> -
> -static inline int
> -vm_pager_get_pages_async(vm_object_t object, vm_page_t *m, int count,
> -    int reqpage, pgo_getpages_iodone_t iodone, void *arg)
> -{
> -
> - VM_OBJECT_ASSERT_WLOCKED(object);
> - return ((*pagertab[object->type]->pgo_getpages_async)(object, m,
> -    count, reqpage, iodone, arg));
> -}
> -
>  static __inline void
>  vm_pager_put_pages(
>   vm_object_t object,


Re: Step 2 Was: more strict KPI for vm_pager_get_pages()

Gleb Smirnoff
  Konstantin,

On Wed, Jun 17, 2015 at 04:45:38PM +0300, Konstantin Belousov wrote:
K> > +++ sys/vm/vm_pager.c (working copy)
K> > @@ -251,7 +251,88 @@ vm_pager_deallocate(object)
K> >  }
K> >  
K> >  /*
K> > - * vm_pager_get_pages() - inline, see vm/vm_pager.h
K> > + * Retrieve pages from the VM system in order to map them into an object
K> > + * ( or into VM space somewhere ).  If the pagein was successful, we
K> > + * must fully validate it.
K> > + */
K> > +int
K> > +vm_pager_get_pages(vm_object_t object, vm_page_t *m, int count, int reqpage)
K> > +{
K> > + int r;
K> > +
K> > + VM_OBJECT_ASSERT_WLOCKED(object);
K> > + KASSERT(count > 0, ("%s: 0 count", __func__));
K> > +
K> > +#ifdef INVARIANTS
K> > + /*
K> > + * All pages must be busied, not mapped, not valid (save the last one),
K> > + * not dirty and belong to the proper object.
K> > + */
K> > + for (int i = 0 ; i < count; i++) {
K> > + vm_page_assert_xbusied(m[i]);
K> > + KASSERT(!pmap_page_is_mapped(m[i]),
K> > +    ("%s: page %p is mapped", __func__, m[i]));
K> > + KASSERT(m[i]->valid == 0 || i == count - 1,
K> > +    ("%s: request for a valid page %p", __func__, m[i]));
K> Are you sure that, e.g., the requested page cannot be partially valid?
K> I am surprised.

All save the last one are always invalid. This passed stress2 for me and for pho@.

K> > + KASSERT(m[i]->dirty == 0,
K> > +    ("%s: page %p is dirty", __func__, m[i]));
K> > + KASSERT(m[i]->object == object,
K> > +    ("%s: wrong object %p/%p", __func__, object, m[i]->object));
K> > + }
K> > +#endif
K> Create a helper function containing the loop and use it, instead of
K> copying the block twice.

Done.

K> > +
K> > + r = (*pagertab[object->type]->pgo_getpages)(object, m, count, reqpage);
K> > + if (r != VM_PAGER_OK)
K> > + return (r);
K> > +
K> > + /*
K> > + * If pager has replaced the page, assert that it had
K> > + * updated the array.
K> > + */
K> > + KASSERT(m[reqpage] == vm_page_lookup(object, m[reqpage]->pindex),
K> > +    ("%s: mismatch page %p pindex %ju", __func__,
K> > +    m[reqpage], (uintmax_t )m[reqpage]->pindex - 1));
K> Why -1?
K> An empty line is needed after the assert, to delineate the code that
K> the following comment describes.

That's an artifact from a future patch. Fixed.

K> Also, you may assert that the requested page is still busied.

Done.
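
Presumably the updated patch now reads along these lines (a sketch; the
attached diff is authoritative):

	/*
	 * If pager has replaced the page, assert that it had
	 * updated the array.  Also assert that the requested
	 * page is still exclusively busied.
	 */
	KASSERT(m[reqpage] == vm_page_lookup(object, m[reqpage]->pindex),
	    ("%s: mismatch page %p pindex %ju", __func__,
	    m[reqpage], (uintmax_t)m[reqpage]->pindex));
	vm_page_assert_xbusied(m[reqpage]);

	/*
	 * Pager didn't fill up entire page.  Zero out
	 * partially filled data.
	 */
	if (m[reqpage]->valid != VM_PAGE_BITS_ALL)
		vm_page_zero_invalid(m[reqpage], TRUE);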

Updated patch attached.

--
Totus tuus, Glebius.


Attachment: pagers.step.2.diff (6K)

Re: Step 2 Was: more strict KPI for vm_pager_get_pages()

Konstantin Belousov
In reply to this post by Gleb Smirnoff
On Tue, Jun 16, 2015 at 12:29:31AM +0300, Gleb Smirnoff wrote:

>   Hi!
>
>   This is step 2 of the "more strict pager KPI" patch:
>
> o Uninline vm_pager_get_pages() into vm_pager.c.
> o Keep all KASSERTs that enforce the KPI of pagers and of their
>   consumers in one place: vm_pager_get_pages().
> o Keep the code that zeroes out partially valid pages in one
>   place in vm_pager_get_pages().
>
I added Rick to Cc:, since there is something I do not quite understand
in the NFS client code. According to the NFSv3 RFC, a server may reply
with a short read to a read RPC. If the EOF flag is not set in the
reply, the data is available, and the truncated reply is only a
transient server condition.

Do we handle short reads?  I see code in the NFS read path that zeroes
the absent parts of the buffer.  Similarly, there is code to zero the
unread part of the page after VOP_GETPAGES().  But couldn't this result
in zeros instead of real data after a short read?

At least, I was not able to find code that would retry after a reply
without EOF.
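
To illustrate the concern with a plain user-space analogue (this is not
the NFS client code, just the general pattern: a short read that is not
EOF should be retried, not zero-filled):

#include <unistd.h>

/* Read exactly 'len' bytes unless a genuine EOF or an error intervenes. */
static ssize_t
read_fully(int fd, void *buf, size_t len)
{
	size_t done = 0;
	ssize_t n;

	while (done < len) {
		n = read(fd, (char *)buf + done, len - done);
		if (n == -1)
			return (-1);	/* hard error */
		if (n == 0)
			break;		/* genuine EOF: stop */
		done += (size_t)n;	/* short read: retry the remainder */
	}
	return ((ssize_t)done);
}

If the read path zero-fills instead of looping like this, a
truncated-but-not-EOF reply would indeed surface as zeros.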