find vs ls performance for walking folders, are there any faster options?

Olav Grønås Gjerde
I'm working on scanning filesystems to build a file search engine and
came across something interesting.

I can walk through 300 000 folders in ~19.5 seconds with this command:
ls -Ra | grep -e "./.*:" | sed "s/://"

With find, it surprisingly takes ~50.5 seconds:
find . -type d

My results are based on five runs of each command to warm up the disk cache.
I've tried this with both UFS and ZFS, and both filesystems show
the same speed difference.
On a modern Linux distribution (Ubuntu 12.10 with ext4), ls is only
slightly faster than find (about 15-20%).

Is there a faster way to walk folders on FreeBSD? Are there any
options (sysctl) I could tune to improve the performance?
_______________________________________________
[hidden email] mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-performance
To unsubscribe, send any mail to "[hidden email]"

Re: find vs ls performance for walking folders, are there any faster options?

Bruce Evans-4

Re: find vs ls performance for walking folders, are there any faster options?

Olav Grønås Gjerde
Thank you, that was a really good answer.

On Wed, Dec 12, 2012 at 3:50 PM, Bruce Evans <[hidden email]> wrote:

> On Wed, 12 Dec 2012, Olav Grønås Gjerde wrote:
>
>> I'm working on scanning filesystems to build a file search engine and
>> came across something interesting.
>>
>> I can walk through 300 000 folders in ~19.5 seconds with this command:
>> ls -Ra | grep -e "./.*:" | sed "s/://"
>>
>> With find, it surprisingly takes ~50.5 seconds:
>> find . -type d
>
>
> This is because 'find' with '-type' lstats all the files.  It doesn't
> use DT_DIR from dirent for some reason.  ls can be slowed down similarly
> using -F.
>
>
>> My results are based on five runs of each command to warm up the disk
>> cache.
>> I've tried this with both UFS and ZFS, and both filesystems show
>> the same speed difference.
>
>
> I get almost exactly the same ratio of speeds on an old version of FreeBSD.
> All the data was cached, and there were only 7 symlinks.  The file system
> was mounted with -noatime, so the cache actually worked.
>
>
>> On a modern Linux distribution (Ubuntu 12.10 with ext4), ls is only
>> slightly faster than find (about 15-20%).
>
>
> Apparently lstat() is relatively much slower in FreeBSD.  It only takes
> 5 usec here, but that is a lot for converting cached data (getpid()
> takes 0.2 usec).  A file system mounted with -atime might be much
> slower, for writing directory timestamps (the sync of the timestamps
> is delayed, but it is a very heavyweight operation).
>
>
>> Is there a faster way to walk folders on FreeBSD? Are there any
>> options (sysctl) I could tune to improve the performance?
>
>
> Nothing much faster than find without -type.  Whatever fts(3) gives.
>
> Bruce
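
For readers who want to see what "using DT_DIR from dirent" could look like in practice, here is a minimal sketch in C. It is not how find or ls is implemented; it only illustrates the idea from the reply above: trust d_type when the file system fills it in, and fall back to lstat() only when the entry type is DT_UNKNOWN. The function names entry_is_dir and walk_dirs are hypothetical, chosen for this example, and the fixed PATH_MAX buffer and plain recursion are simplifying assumptions.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1   /* assumption: needed on glibc for d_type/DT_*; harmless elsewhere */
#endif
#include <assert.h>
#include <dirent.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the entry is a directory, 0 otherwise.
 * Uses the type recorded in the dirent when available, so most entries
 * need no extra system call. Symlinks are not followed. */
static int entry_is_dir(const char *path, const struct dirent *de)
{
    struct stat st;

    assert(path != NULL && de != NULL);
    if (de->d_type != DT_UNKNOWN)
        return de->d_type == DT_DIR;
    /* The file system did not report a type: fall back to lstat(). */
    return lstat(path, &st) == 0 && S_ISDIR(st.st_mode);
}

/* Hypothetical walker: prints 'root' and every directory below it to 'out'
 * (pass NULL to only count). Assumes 'root' itself is a directory.
 * Returns the number of directories seen, including 'root'. */
static long walk_dirs(const char *root, FILE *out)
{
    char path[PATH_MAX];
    struct dirent *de;
    long count = 1;                 /* counts 'root' itself */
    DIR *d;
    int n;

    if (out != NULL)
        fprintf(out, "%s\n", root);

    d = opendir(root);
    if (d == NULL)
        return count;               /* unreadable: listed but not descended into */

    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
            continue;
        n = snprintf(path, sizeof(path), "%s/%s", root, de->d_name);
        if (n < 0 || (size_t)n >= sizeof(path))
            continue;               /* path too long for this sketch: skip it */
        if (entry_is_dir(path, de))
            count += walk_dirs(path, out);
    }
    closedir(d);
    return count;
}
```

Calling walk_dirs(".", stdout) prints one directory per line, much like the output of find . -type d. The sketch keeps one directory handle open per nesting level, so a very deep tree could run into the open-file limit; whether it is actually faster than find or ls on a given system would have to be measured.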