Bug ID: 218622
Summary: libc/gen/telldir [hack-n-PATCH] performance limited
to O(n) vs file count, O(n^2) against samba ls
Product: Base System
Severity: Affects Many People
Assignee: [hidden email] Reporter: [hidden email]
We have been tracking a performance regression in FreeBSD 11 stable based smbd
pegs a cpu and takes much longer to list huge directories than it's FreeBSD 9.x
base counterpart . Profiling showed that the time was geometrically related to
file count within the directory.
#./dt_time_in_pid_func.dt `top -b | head -10 | tail -1 | chomp | cut -w -f1`
dtrace: script './dt_time_in_pid_func.dt' matched 3 probes
CPU ID FUNCTION:NAME
7 1 :BEGIN thinking, hit control-c when you
are tired of it
value ------------- Distribution ------------- count
2048 | 0
4096 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@ 53883
8192 |@ 730
16384 | 502
and under 11-stable(ish):
#dtrace -s dtprofile.is_in_path.dt `top -b | head -10 | tail -1 | chomp | cut
-w -f1` telldir
dtrace: script 'dtprofile.is_in_path.dt' matched 2 probes
Performance for the telldir returned to constant time.
The change appears important to @standards however the impact is tough to
explain to samba users. To conjecture, I wonder if a run time tunable could
select the 'conforming' or 'fast' behaviour for telldir like LD_PRELOAD...
--- Comment #2 from John Baldwin <[hidden email]> ---
So this wasn't purely about standards@ compliance, but there was actual
software that was broken on FreeBSD because of this (it might have involved
using telldir() on a large directory accessed via NFS where the client broke.
If it's the thing I'm thinking of then you would have an 'ls' on a large NFS
directory that would never complete). The larger fix is to change
getdirentries() to report a valid seek location with each 'struct dirent'.
That depends on changing the contents of 'struct dirent' which is a non-trivial
thing to do, but is included in the ongoing 'ino64' work. Once that is in
place you don't need the telldir cookies at all. One thing you could do for
now is replace the linear O(n) search with something better like a tree that
you can do a binary search on.
Optimize pathologic case of telldir() for Samba.
When application reads large directory, calling telldir() for each entry,
like Samba does, it creates exponential performance drop as number of
entries reach tenths to hundreds of thousands. It is caused by full search
through the internal list, that never finds matches in that scenario, but
creates O(n^2) delays. This patch optimizes that search, limiting it to
entries of the same buffer, turning time closer to O(n) in case of linear