Adding namecache entries outside of vfs_lookup and vn_open ?

Alan Somers-2
It looks like lookup and open are the only common vops that create new
namecache entries.  At least, those are the only ones that set
MAKEENTRY in the cn_flags field.  However, fuse(4)'s create-like
operations (FUSE_CREATE, FUSE_SYMLINK, etc) all return enough
information to create a namecache entry for the newly created file.
As-is, an operation like FUSE_CREATE will almost always be followed up
by a FUSE_LOOKUP, necessitating an extra round-trip to userland.

Would it be possible and wise to add these newly created entries to
the namecache automatically?

-Alan
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[hidden email]"

Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Konstantin Belousov
On Sat, Mar 02, 2019 at 06:02:06PM -0700, Alan Somers wrote:
> It looks like lookup and open are the only common vops that create new
> namecache entries.  At least, those are the only ones that set
> MAKEENTRY in the cn_flags field.  However, fuse(4)'s create-like
> operations (FUSE_CREATE, FUSE_SYMLINK, etc) all return enough
> information to create a namecache entry for the newly created file.
> As-is, an operation like FUSE_CREATE will almost always be followed up
> by a FUSE_LOOKUP, necessitating an extra round-trip to userland.
In VFS, a new file is created by VOP_CREATE() after a negative
VOP_LOOKUP().  VOP_CREATE() returns the new vnode, which is installed
into the file.  [A flag, VN_OPEN_NAMECACHE, was added for vn_open_cred();
it causes the created name entry to be inserted into the namecache.  It
was added to handle a very specific situation in the core dump code,
which is no longer relevant.  The flag is still there.]

A similar discussion occurred some time ago.  I think that the current
selection of cases where a namecache entry is created is optimized so
that extracting a large tarball does not significantly disturb the
non-directory elements of the cache.  If you do such an extraction, it
is unlikely that you will access most of the files again soon.

> Would it be possible and wise to add these newly created entries to
> the namecache automatically?
Not from VFS, but the filesystem can override the policy by inserting
elements into the cache from its VOPs as it sees fit.

Does FUSE cache vnodes ?  I would find aggressive caching on the kernel
side somewhat unexpected for it.


Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Alan Somers-2
On Sun, Mar 3, 2019 at 4:03 AM Konstantin Belousov <[hidden email]> wrote:

> > Would it be possible and wise to add these newly created entries to
> > the namecache automatically?
> Not from VFS, but the filesystem can override the policy by inserting
> elements into the cache from its VOPs as it sees fit.

So MAKEENTRY is just advisory, and there shouldn't be a problem with
inserting cache entries from fuse_vnop_create even if MAKEENTRY isn't
set?  I might try that.  The penalty for not doing so is an extra trip
to userland, which is greater than the penalty other file systems pay
for not doing it.
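
To make the round-trip argument concrete, here is a toy userspace model
in C.  It is only a sketch and not fuse(4) code: toy_fs, toy_create,
toy_lookup and the cache_on_create switch are all hypothetical names,
and the "daemon" is just a second in-memory table.  It counts how many
simulated trips to userland a create followed by a lookup costs, with
and without entering the new name into the cache at create time.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; none of these names exist in the FreeBSD kernel. */
#define TOY_SLOTS    16
#define TOY_NAME_MAX 32

struct toy_table {
	struct {
		bool used;
		char name[TOY_NAME_MAX];
		int  node;
	} slot[TOY_SLOTS];
};

struct toy_fs {
	struct toy_table daemon;   /* what the userland daemon knows */
	struct toy_table cache;    /* the kernel-side name cache */
	int next_node;             /* next node id the daemon hands out */
	int round_trips;           /* simulated trips to userland */
};

static void
toy_fs_init(struct toy_fs *fs)
{
	memset(fs, 0, sizeof(*fs));
	fs->next_node = 1;         /* 0 is reserved for "not found" */
}

/* Return the node id for name, or 0 if the table has no such name. */
static int
toy_table_find(const struct toy_table *t, const char *name)
{
	for (size_t i = 0; i < TOY_SLOTS; i++)
		if (t->slot[i].used && strcmp(t->slot[i].name, name) == 0)
			return (t->slot[i].node);
	return (0);
}

/* Remember name -> node; silently does nothing if the table is full. */
static void
toy_table_add(struct toy_table *t, const char *name, int node)
{
	for (size_t i = 0; i < TOY_SLOTS; i++) {
		if (!t->slot[i].used) {
			t->slot[i].used = true;
			strncpy(t->slot[i].name, name, TOY_NAME_MAX - 1);
			t->slot[i].name[TOY_NAME_MAX - 1] = '\0';
			t->slot[i].node = node;
			return;
		}
	}
}

/* Create: one round trip; the reply already carries the new node id. */
static int
toy_create(struct toy_fs *fs, const char *name, bool cache_on_create)
{
	int node = fs->next_node++;

	fs->round_trips++;
	toy_table_add(&fs->daemon, name, node);
	if (cache_on_create)
		toy_table_add(&fs->cache, name, node);
	return (node);
}

/* Lookup: a cache hit is free, a miss costs one more round trip. */
static int
toy_lookup(struct toy_fs *fs, const char *name)
{
	int node = toy_table_find(&fs->cache, name);

	if (node != 0)
		return (node);
	fs->round_trips++;
	node = toy_table_find(&fs->daemon, name);
	if (node != 0)
		toy_table_add(&fs->cache, name, node);
	return (node);
}
```

With cache_on_create false, toy_create() followed by toy_lookup() of the
same name costs two round trips; with it true, the lookup is served from
the cache and only the create's round trip remains.  In a real
filesystem the insertion would presumably go through the kernel's
cache_enter() from the create VOP, which this model does not attempt to
reproduce.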

>
> Does FUSE cache vnodes ?  I would find aggressive caching on the kernel
> side somewhat unexpected for it.

No, it just uses the regular vnode cache.  The unique thing it does is
cache file attributes within the vnode, and the daemon can request a
timeout period for either the attr cache or the entry cache.  When the
timeout expires, the kernel is supposed to purge (or ignore) its cached
values.

-Alan

Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Konstantin Belousov
On Sun, Mar 03, 2019 at 09:02:07AM -0700, Alan Somers wrote:

> So MAKEENTRY is just advisory, and there shouldn't be a problem with
> inserting cache entries from fuse_vnop_create even if MAKEENTRY isn't
> set?  I might try that.  The penalty for not doing so is an extra trip
> to userland, which is greater than the penalty other file systems pay
> for not doing it.
Overly aggressive caching can cause problems.  See below.

>
> >
> > Does FUSE cache vnodes ?  I would find aggressive caching on the kernel
> > side somewhat unexpected for it.
>
> No, it just uses the regular vnode cache.  The unique thing it does is
> cache file attributes within the vnode, and the daemon can request a
> timeout period for either the attr cache or the entry cache.  When the
> timeout expires, the kernel is supposed to purge (or ignore) its cached
> values.

This is what I mean: e.g., one strategy there might be to reclaim the
fuse vnode on inactivation.  This is very harsh, of course, but nullfs
did this not too long ago.

For a less contrived example: on NFS, with its relatively well-defined
semantics, caching on the client sometimes becomes problematic.  AFAIR,
the NFS client re-checks mtime in strategic places, and ensures
close-to-open consistency by always flushing attributes on close, at
least for NFSv3.

I am somewhat surprised that for FUSE it is considered safe (and useful)
to cache at all.
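
The mtime re-check mentioned above for NFS can be illustrated with a
small userspace sketch.  This is an assumption-laden model, not the NFS
client's actual code: toy_cached_file, toy_cache_fill and
toy_cache_revalidate are hypothetical names, and the only idea shown is
"remember the mtime seen when the cache was filled, and drop the cache
if a later stat shows a different one".

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record of what the client believed when it last cached. */
struct toy_cached_file {
	bool   valid;   /* is the cached copy still trusted? */
	time_t mtime;   /* modification time seen when the cache was filled */
};

/* Fill (or refill) the cache from freshly fetched attributes. */
static void
toy_cache_fill(struct toy_cached_file *c, const struct stat *sb)
{
	c->valid = true;
	c->mtime = sb->st_mtime;
}

/*
 * Called at a "strategic place" such as open: compare the remembered
 * mtime against freshly fetched attributes and invalidate on mismatch.
 * Returns whether the cached copy may still be used.
 */
static bool
toy_cache_revalidate(struct toy_cached_file *c, const struct stat *sb)
{
	if (c->valid && c->mtime != sb->st_mtime)
		c->valid = false;
	return (c->valid);
}
```

Once invalidated, the entry stays unusable until toy_cache_fill() is
called again with fresh attributes, which is the conservative choice
for a sketch like this.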

Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Alan Somers-2
On Sun, Mar 3, 2019 at 9:25 AM Konstantin Belousov <[hidden email]> wrote:

> This is what I mean: e.g., one strategy there might be to reclaim the
> fuse vnode on inactivation.  This is very harsh, of course, but nullfs
> did this not too long ago.

Currently fuse doesn't do anything special when the timeout expires.
It only checks the timeout on lookup, and ignores the cached value if
the timeout has already expired.

>
> For a less contrived example: on NFS, with its relatively well-defined
> semantics, caching on the client sometimes becomes problematic.  AFAIR,
> the NFS client re-checks mtime in strategic places, and ensures
> close-to-open consistency by always flushing attributes on close, at
> least for NFSv3.
>
> I am somewhat surprised that for FUSE it is considered safe (and useful)
> to cache at all.

The daemon can choose the timeout period.  For local filesystems like
fusefs-ext2 it might set the timeout to infinity.  For simple network
filesystems like fusefs-sshfs it might set the timeout to 0, disabling
all kernel caching.  And a more sophisticated network filesystem, such
as an NFSv4 client, might set the timeout to a finite non-zero time.
Later versions of the fuse protocol also allow the daemon to tell the
kernel to immediately expire its cache.
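
The lookup-time check and the daemon-chosen timeouts described above
could be modeled roughly as follows.  This is a minimal sketch with
hypothetical names (toy_entry, toy_entry_fill, toy_entry_usable) and
plain nanosecond counters in place of real kernel time; it is not the
fuse(4) implementation.  A timeout of 0 yields an entry that is never
usable, and a very large timeout saturates instead of wrapping, which
stands in for "infinity".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cached name entry with an absolute expiry time. */
struct toy_entry {
	int      node;         /* cached result of the lookup */
	uint64_t expires_ns;   /* absolute time after which it is ignored */
};

/* Record a reply whose validity period is valid_ns, starting at now_ns. */
static void
toy_entry_fill(struct toy_entry *e, int node, uint64_t now_ns,
    uint64_t valid_ns)
{
	e->node = node;
	/* Saturate so that a huge ("infinite") timeout cannot wrap around. */
	if (valid_ns > UINT64_MAX - now_ns)
		e->expires_ns = UINT64_MAX;
	else
		e->expires_ns = now_ns + valid_ns;
}

/*
 * Checked only at lookup time: nothing happens at the moment the
 * timeout runs out, the stale entry is simply ignored from then on.
 */
static bool
toy_entry_usable(const struct toy_entry *e, uint64_t now_ns)
{
	return (now_ns < e->expires_ns);
}
```

A caller would fall back to asking the daemon whenever
toy_entry_usable() returns false, and refill the entry from the reply.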

-Alan

Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Alan Somers-2
In reply to this post by Konstantin Belousov
On Sun, Mar 3, 2019 at 4:03 AM Konstantin Belousov <[hidden email]> wrote:

> A similar discussion occurred some time ago.  I think that the current
> selection of cases where a namecache entry is created is optimized so
> that extracting a large tarball does not significantly disturb the
> non-directory elements of the cache.  If you do such an extraction, it
> is unlikely that you will access most of the files again soon.

I don't understand this objection.  When you extract a tarball full of
non-empty files, don't you still need to open every file to write its
contents, creating a namecache entry for each one?


Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Konstantin Belousov
On Mon, Mar 04, 2019 at 08:24:27AM -0700, Alan Somers wrote:

> On Sun, Mar 3, 2019 at 4:03 AM Konstantin Belousov <[hidden email]> wrote:
> > A similar discussion occurred some time ago.  I think that the current
> > selection of cases where a namecache entry is created is optimized so
> > that extracting a large tarball does not significantly disturb the
> > non-directory elements of the cache.  If you do such an extraction, it
> > is unlikely that you will access most of the files again soon.
>
> I don't understand this objection.  When you extract a tarball full of
> non-empty files, don't you still need to open every file to write its
> contents, creating a namecache entry for each one?
No, you don't.

Typically, when the archiver parses the stream and finds that there is
a file with content to create, it
- opens the file, and gets the file descriptor returned to usermode.
  Internally, the kernel does (vn_open_cred())
        namei() <- this call returns no vnode because the file does not
                   exist, and does not create a negative cache entry;
                   see the NOCACHE flag in cn_flags.
        VOP_CREATE() <- creates the file, again without caching
        assign the returned vnode to the file
- now the process has the descriptor for writes, but the namecache entry
  is still not installed.
- content is written, file is closed.
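
Seen from userland, the archiver's side of the steps above might look
like the sketch below.  extract_file is a hypothetical helper written
for illustration, not code taken from tar(1); the point is only that a
single open(2) both creates the file and returns the descriptor, and
the path is never resolved a second time.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a new file and write len bytes of data into it through the
 * descriptor returned by the creating open(2).  Fails if the file
 * already exists.  Returns 0 on success, -1 on error.
 */
static int
extract_file(const char *path, const void *data, size_t len)
{
	const char *p = data;
	int fd;

	/* One call: create the file and get a descriptor for writing. */
	fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd == -1)
		return (-1);

	/* Write everything; write(2) may accept less than requested. */
	while (len > 0) {
		ssize_t n = write(fd, p, len);

		if (n == -1) {
			close(fd);
			return (-1);
		}
		p += n;
		len -= (size_t)n;
	}

	/* No further lookup of path happens before or after the close. */
	return (close(fd));
}
```

A symlink, by contrast, is made with symlink(2), which returns no
descriptor, so any later access has to go through a fresh name lookup.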

Re: Adding namecache entries outside of vfs_lookup and vn_open ?

Alan Somers-2
On Mon, Mar 4, 2019 at 8:42 AM Konstantin Belousov <[hidden email]> wrote:

> > I don't understand this objection.  When you extract a tarball full of
> > non-empty files, don't you still need to open every file to write its
> > contents, creating a namecache entry for each one?
> No, you don't.
>
> Typically, when the archiver parses the stream and finds that there is
> a file with content to create, it
> - opens the file, and gets the file descriptor returned to usermode.
>   Internally, the kernel does (vn_open_cred())
>         namei() <- this call returns no vnode because the file does not
>                    exist, and does not create a negative cache entry;
>                    see the NOCACHE flag in cn_flags.
>         VOP_CREATE() <- creates the file, again without caching
>         assign the returned vnode to the file
> - now the process has the descriptor for writes, but the namecache entry
>   is still not installed.
> - content is written, file is closed.

OK, that makes sense.  So I guess the problem only really applies to
file types like symlinks that can't be created and opened in one step.
But in the tarball case, you wouldn't need to access the symlink again
anyway.
-Alan