Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

We have encountered an unexpected performance issue when traversing directories looking for files using a wildcard pattern.

We have 180 folders, each containing 10,000 files. A command-line search using dir <pattern> /s completes almost instantly (under 0.25 seconds). However, the same search from our application takes 3 to 4 seconds.

We initially tried using System.IO.DirectoryInfo.GetFiles() with SearchOption.AllDirectories and have now tried the Win32 API calls FindFirstFile() and FindNextFile().

Profiling our code indicates that the vast majority of execution time is spent in these calls.

Our code is based on the following blog post:

http://codebetter.com/blogs/matthew.podwysocki/archive/2008/10/16/functional-net-fighting-friction-in-the-bcl-with-directory-getfiles.aspx

We found this to be slow, so we updated the GetFiles function to take a string search pattern rather than a predicate.

Can anyone shed any light on what might be wrong with our approach?


1 Answer

In my tests, FindFirstFileEx with FindExInfoBasic and FIND_FIRST_EX_LARGE_FETCH was much faster than plain FindFirstFile.

Scanning 20 folders with ~300,000 files took 661 seconds with FindFirstFile and 11 seconds with FindFirstFileEx. Subsequent calls to the same folders took less than a second.
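The gap between the first scan and the later ones suggests that caching affects these figures, so it may help to time the first run and a repeat run separately when comparing approaches. A minimal sketch, assuming the scan is wrapped in a callable; time_ms, time_twice and ScanTiming are hypothetical names used only for illustration, not part of the original test:

```cpp
#include <chrono>
#include <functional>

// Runs `scan` once and returns the elapsed wall-clock time in milliseconds.
double time_ms(const std::function<void()>& scan) {
    const auto start = std::chrono::steady_clock::now();
    scan();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Hypothetical result type: first-run and repeat-run timings kept apart.
struct ScanTiming {
    double first_ms;
    double repeat_ms;
};

// Times the same scan twice in a row so the two figures can be reported separately.
ScanTiming time_twice(const std::function<void()>& scan) {
    ScanTiming t;
    t.first_ms = time_ms(scan);
    t.repeat_ms = time_ms(scan);
    return t;
}
```

Calling time_twice with a lambda that performs the directory scan gives one number for the first pass and one for the repeat, which makes it easier to compare against a dir /s run under the same conditions.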

HANDLE h=FindFirstFileEx(search.c_str(), FindExInfoBasic, &data, FindExSearchNameMatch, NULL, FIND_FIRST_EX_LARGE_FETCH); 
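For context, FindExInfoBasic tells the call not to query the short (8.3) file name, and FIND_FIRST_EX_LARGE_FETCH asks for a larger buffer for directory queries; both need Windows 7 / Windows Server 2008 R2 or later. The call above could sit inside a recursive scan roughly like the sketch below. This is only one possible structure, not the code from the question or from my tests: find_files and wildcard_match are hypothetical names, and the matcher is a simplified assumption that handles only * and ? case-insensitively (it does not reproduce Windows quirks such as *.* matching names without a dot). Each directory is listed once with * and names are matched in process; whether that beats passing the pattern to the file system is something to measure.

```cpp
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

#ifdef _WIN32
#include <windows.h>
#endif

// Simplified, case-insensitive wildcard match: '*' matches any run of
// characters (including none), '?' matches exactly one character.
bool wildcard_match(const std::wstring& pattern, const std::wstring& name) {
    std::size_t p = 0, n = 0, mark = 0;
    std::size_t star = std::wstring::npos;  // position of the last '*' seen
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == L'*') {
            star = p++;   // remember the '*' and try matching zero characters
            mark = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == L'?' ||
                    std::towlower(pattern[p]) == std::towlower(name[n]))) {
            ++p;
            ++n;
        } else if (star != std::wstring::npos) {
            p = star + 1; // backtrack: let the last '*' absorb one more character
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == L'*') ++p;  // trailing '*' match nothing
    return p == pattern.size();
}

#ifdef _WIN32
// Recursively appends to `out` the full paths of files under `dir`
// whose names match `pattern` (for example L"*.log").
void find_files(const std::wstring& dir, const std::wstring& pattern,
                std::vector<std::wstring>& out) {
    WIN32_FIND_DATAW data;
    const std::wstring spec = dir + L"\\*";
    HANDLE h = FindFirstFileExW(spec.c_str(), FindExInfoBasic, &data,
                                FindExSearchNameMatch, NULL,
                                FIND_FIRST_EX_LARGE_FETCH);
    if (h == INVALID_HANDLE_VALUE) return;  // missing or unreadable directory is skipped

    std::vector<std::wstring> subdirs;
    do {
        const std::wstring name = data.cFileName;
        if (data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) {
            const bool dots = (name == L"." || name == L"..");
            // Junctions and symlinks are skipped to avoid cycles.
            const bool link = (data.dwFileAttributes & FILE_ATTRIBUTE_REPARSE_POINT) != 0;
            if (!dots && !link) subdirs.push_back(dir + L"\\" + name);
        } else if (wildcard_match(pattern, name)) {
            out.push_back(dir + L"\\" + name);
        }
    } while (FindNextFileW(h, &data));
    FindClose(h);  // close before recursing so only one handle is open at a time

    for (const std::wstring& sub : subdirs) find_files(sub, pattern, out);
}
#endif
```

A call would look like find_files(L"C:\\data", L"*.log", results), with the root path and pattern being placeholders. From a .NET application the same call would have to go through P/Invoke, which this sketch does not cover.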
