[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

            Bug ID: 218622
           Summary: libc/gen/telldir [hack-n-PATCH]  performance limited
                    to O(n) vs file count, O(n^2) against samba ls
                    workload
           Product: Base System
           Version: 11.0-STABLE
          Hardware: Any
                OS: Any
            Status: New
          Severity: Affects Many People
          Priority: ---
         Component: standards
          Assignee: [hidden email]
          Reporter: [hidden email]

We have been tracking a performance regression in FreeBSD 11 stable based smbd
pegs a cpu and takes much longer to list huge directories than it's FreeBSD 9.x
base counterpart .  Profiling showed that the time was geometrically related to
file count within the directory.

Samba emits a telldir for every file during a readdir; and in 9.x that seemed
to run in linear time and things where fine; however this change appears to
expands the workload of list scan to O(n^2) vs file count.
https://svnweb.freebsd.org/base/stable/11/lib/libc/gen/telldir.c?r1=235647&r2=269204

For a directory with 64k files, the performance is as follows when driven by
samba listing files. Identical hardware and zfs data sets are used:

using dtrace script:
BEGIN { printf("thinking, hit control-c when you are tired of it");}

pid$1::$2:entry
        { self->st= timestamp; }
pid$1::$2:return
        {
        @[execname,"delta(ns)" ] = quantize(  timestamp - self->st);
        self->st = 0;
        }

9.3:
#./dt_time_in_pid_func.dt `top -b | head -10 | tail -1 | chomp | cut -w -f1`
telldir
dtrace: script './dt_time_in_pid_func.dt' matched 3 probes
CPU     ID                    FUNCTION:NAME
  7      1                           :BEGIN thinking, hit control-c when you
are tired of it
^C

  smbd                                                delta(ns)                
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@  53883    
            8192 |@                                        730      
           16384 |                                         502    

and under 11-stable(ish):
 #dtrace -s dtprofile.is_in_path.dt `top -b | head -10 | tail -1 | chomp | cut
-w -f1` telldir  
dtrace: script 'dtprofile.is_in_path.dt' matched 2 probes
^C

  deltans                                          
           value  ------------- Distribution ------------- count    
            2048 |                                         0        
            4096 |                                         90      
            8192 |                                         8        
           16384 |                                         0        
           32768 |                                         0        
           65536 |@                                        1583    
          131072 |@@@@@@@@@@                               12270    
          262144 |@@@@@@@@@@@@@@@@@@@@@@                   28159    <<- libc
telldir takes how long now?!
          524288 |@@@@@@@                                  9299    
         1048576 |      



After reverting the telldir change shamelessly:
https://github.com/freenas/os/commit/92873f3190c830302143d759411b23bd719b0ba2

Performance for the telldir returned to constant time.

The change appears important to @standards however the impact is tough to
explain to samba users.  To conjecture, I wonder if a run time tunable could
select the 'conforming' or 'fast' behaviour for telldir like  LD_PRELOAD...
style directives.

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

Mark Linimon <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[hidden email],
                   |                            |[hidden email]

--- Comment #1 from Mark Linimon <[hidden email]> ---
Notify committer and reviewer of the rather-old r269204 commit (itself based on
PR 121656, Phabricator https://phabric.freebsd.org/D490.)

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
In reply to this post by bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

--- Comment #2 from John Baldwin <[hidden email]> ---
So this wasn't purely about standards@ compliance, but there was actual
software that was broken on FreeBSD because of this (it might have involved
using telldir() on a large directory accessed via NFS where the client broke.
If it's the thing I'm thinking of then you would have an 'ls' on a large NFS
directory that would never complete).  The larger fix is to change
getdirentries() to report a valid seek location with each 'struct dirent'.
That depends on changing the contents of 'struct dirent' which is a non-trivial
thing to do, but is included in the ongoing 'ino64' work.  Once that is in
place you don't need the telldir cookies at all.  One thing you could do for
now is replace the linear O(n) search with something better like a tree that
you can do a binary search on.

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
In reply to this post by bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

Alexander Motin <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|New                         |In Progress
                 CC|                            |[hidden email]

--- Comment #3 from Alexander Motin <[hidden email]> ---
I propose such a patch for this problem: https://reviews.freebsd.org/D10408 .
On my synthetic tests it shows great results with minimal additional
complexity.

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
In reply to this post by bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

--- Comment #4 from ash <[hidden email]> ---
nice..

Effect under smbd workload is good.

smbd                                                delta(ns)                  
           value  ------------- Distribution ------------- count    
            1024 |                                         0        
            2048 |@@@@@@@@                         10912    
            4096 |@@@@@@@@@@@                  15434    
            8192 |                                         221      
...

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
In reply to this post by bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

--- Comment #5 from [hidden email] ---
A commit references this bug:

Author: mav
Date: Mon Apr 17 19:03:31 UTC 2017
New revision: 317064
URL: https://svnweb.freebsd.org/changeset/base/317064

Log:
  Optimize pathologic case of telldir() for Samba.

  When application reads large directory, calling telldir() for each entry,
  like Samba does, it creates exponential performance drop as number of
  entries reach tenths to hundreds of thousands.  It is caused by full search
  through the internal list, that never finds matches in that scenario, but
  creates O(n^2) delays.  This patch optimizes that search, limiting it to
  entries of the same buffer, turning time closer to O(n) in case of linear
  directory scan.

  PR:           218622
  Reviewed by:  jhb, jilles
  MFC after:    2 weeks
  Sponsored by: iXsystems, Inc.
  Differential Revision:        https://reviews.freebsd.org/D10408

Changes:
  head/lib/libc/gen/telldir.c

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

[Bug 218622] libc/gen/telldir [hack-n-PATCH] performance limited to O(n) vs file count, O(n^2) against samba ls workload

bugzilla-noreply
In reply to this post by bugzilla-noreply
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=218622

Alexander Motin <[hidden email]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|In Progress                 |Closed
         Resolution|---                         |FIXED

--- Comment #6 from Alexander Motin <[hidden email]> ---
I've merged the patch into 11 and 10 stable.

--
You are receiving this mail because:
You are the assignee for the bug.
_______________________________________________
[hidden email] mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-standards
To unsubscribe, send any mail to "[hidden email]"
Loading...