tim@j.cc.purdue.edu (Timothy Lange) (12/15/88)
I am dealing with a user who has around 650 files in a subdirectory.  We
noticed that accessing the files at the bottom of the DIR listing is much
slower than the files near the beginning.  The performance really drops off
at about the 512th file.

Now I know that many files in one directory is not good, so don't flame me.
I also know that a directory entry takes 32 bytes, a sector is 512 bytes,
and on this machine a cluster is 4 sectors.  He has 'BUFFERS=40' in his
config.sys file.

What is magical about 512 files?  I cannot figure out why that number is
significant (but it looks good!).  The first 500 or so files are accessed
quite rapidly; the rest have a terrible access time.  In fact, for files of
equal size, I can make a copy of the first ten files ten times quicker than
the last ten.  I would love some info that explains my file access times.

Tim.
--
Timothy Lange / Purdue University Computing Center / Mathematical Sciences
Bldg / West Lafayette, IN 47907 / 317-494-1787 / tim@j.cc.purdue.edu /
CIS 75410,525
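[A quick back-of-the-envelope check, using only the figures Tim quotes (32
bytes per entry, 512-byte sectors, 4 sectors per cluster); the Python
framing and function name are illustrative, not part of the thread:]

```python
# Sizing the directory from the post's own numbers (assumed accurate):
# 32 bytes/entry, 512-byte sectors, 4 sectors per cluster.
ENTRY_BYTES = 32
SECTOR_BYTES = 512
SECTORS_PER_CLUSTER = 4

def directory_footprint(n_files):
    """Return (sectors, clusters) needed to hold n_files directory entries."""
    entries_per_sector = SECTOR_BYTES // ENTRY_BYTES   # 16 entries/sector
    sectors = -(-n_files // entries_per_sector)        # ceiling division
    clusters = -(-sectors // SECTORS_PER_CLUSTER)
    return sectors, clusters

print(directory_footprint(650))   # (41, 11): 41 sectors, 11 clusters
print(directory_footprint(512))   # (32, 8): the "magic" 512 is 32 sectors
```

So 650 entries span 41 sectors; note that 512 files is exactly 32 sectors,
a suggestively round number but not obviously tied to any DOS limit.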
ward@chinet.chi.il.us (Ward Christensen) (12/17/88)
In article <8545@j.cc.purdue.edu> tim@j.cc.purdue.edu (Timothy Lange) writes:
>I am dealing with a user who has around 650 files in a subdirectory.
>We noticed that accessing the files at the bottom of the DIR listing
>is much slower than the files near the beginning.  The performance
>really drops off at about the 512th file.

That's an easy one.  I have had it happen many times (of course it wasn't
easy at FIRST ;-).

What is happening is fragmentation of your directory, resulting in many
seeks while processing the directory.  The solution is very simple: fewer
files; or, more realistically for your application, compact the disk, such
as with Norton Advanced Utilities' SD (Speed Disk).  This brings the pieces
of the directory together, and acts quite fast.  Good luck!
del@Data-IO.COM (Erik Lindberg) (12/20/88)
In article <7192@chinet.chi.il.us> ward@chinet.chi.il.us (Ward Christensen) writes:
>In article <8545@j.cc.purdue.edu> tim@j.cc.purdue.edu (Timothy Lange) writes:
>>I am dealing with a user who has around 650 files in a subdirectory.
>>We noticed that accessing the files at the bottom of the DIR listing
>>is much slower than the files near the beginning.  The performance
>>really drops off at about the 512th file.
>
> That's an easy one.  I have had it happen many times (of course it
>wasn't easy at FIRST ;-).
>
> What is happening is fragmentation of your directory, resulting in
>many seeks while processing the directory.

This might seem to be what is happening on your system, but it isn't right.
I run a large disk cache on my system, and the directories easily fit in
cache memory.  I have observed the same behaviour: a massive increase in
file access time when the number of files in a subdirectory exceeds 512.
And that is with *NO* physical disk activity at all.

> The solution is very simple: fewer files; or, more realistically for your
>application, compact the disk, such as with Norton Advanced Utilities' SD

Fewer files would be nice if the application would not suffer for it.  I
doubt that SpeedDisk will provide significant improvements.
--
del (Erik Lindberg)
uw-beaver!tikal!pilchuck!del
greggt@VAX1.CC.UAKRON.EDU (Gregg Thompson) (12/21/88)
I have noticed on machines with too many files in a directory that the
problem goes away if you increase the BUFFERS setting in config.sys.  I
usually set BUFFERS to twice the number of files, and if there are a lot of
files I will increase it as far as 99.
--
To live is to die, to die is to live forever;    Gregg Thompson
Where will you spend eternity?                   greggt@vax1.cc.uakron.edu
simon@ms.uky.edu (Simon Gales) (12/21/88)
In article <8545@j.cc.purdue.edu> tim@j.cc.purdue.edu (Timothy Lange) writes:
>I am dealing with a user who has around 650 files in a subdirectory.
>We noticed that accessing the files at the bottom of the DIR listing
>is much slower than the files near the beginning.  The performance
>really drops off at about the 512th file.

Using Norton's SD to de-fragment your disk may help, but that probably
isn't the problem.  DOS is having to search through a _lot_ of files to
find the one it wants to open.  If the directory is in your path, it should
at least be the last one.

If you aren't accessing all the files often, sort the directory so that the
files you access most are at the beginning (with Norton's DS).  Try using
FASTOPEN to cache your directory entries, and make sure you make the cache
size as big as your directory (FASTOPEN C:=700).  Best of all, try
splitting the directory into two or more directories.
--
/--------------------------------------------------------------------------\
  Simon Gales@University of Ky          UUCP:   {rutgers, uunet}!ukma!simon
  Arpa:   simon@ms.uky.edu              MaBell: (606) 263-2285/257-3597
  BitNet: simon@UKMA.BITNET
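[The last suggestion above - splitting one big directory into several
smaller ones - can be sketched in a few lines.  The bucket-by-first-letter
rule and all the filenames here are hypothetical, just one possible way to
carve up a flat directory:]

```python
# Illustrative sketch: distribute a flat list of filenames into
# per-initial subdirectories, so no single directory stays huge.
from collections import defaultdict

def bucket_by_initial(filenames):
    """Group filenames by uppercased first letter (hypothetical rule)."""
    buckets = defaultdict(list)
    for name in filenames:
        buckets[name[0].upper()].append(name)
    return {initial: sorted(names) for initial, names in buckets.items()}

# 650 files would land in up to 26 much smaller directories.
print(bucket_by_initial(["alpha.txt", "beta.txt", "apple.txt"]))
```

Any rule that spreads the entries roughly evenly would do; the point is
only that each resulting directory stays well under the slow regime.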
les@chinet.chi.il.us (Leslie Mikesell) (12/22/88)
In article <10723@s.ms.uky.edu> simon@ms.uky.edu (Simon Gales) writes:
>DOS is having to search through a _lot_ of files to
>find the one it wants to open.  If the directory is in your path,
>it should at least be the last one.

Is there any way to avoid having DOS search your current directory (i.e.,
only look in the PATH) for programs?  This is especially a problem when
working in large directories over a network.

Les Mikesell
toma@tekgvs.GVS.TEK.COM (Tom Almy) (12/23/88)
In article <8545@j.cc.purdue.edu> tim@j.cc.purdue.edu (Timothy Lange) writes:
>I am dealing with a user who has around 650 files in a subdirectory.
>We noticed that accessing the files at the bottom of the DIR listing
>is much slower than the files near the beginning.  The performance
>really drops off at about the 512th file.

I have performed tests on my machine and found that directory searching
performance drops off dramatically (*orders of magnitude*) when there are
more than 12 * (number of block buffers specified in config.sys) files in
the directory.

Three suggestions (in order of value):

1) Move the files into multiple subdirectories.
2) Use a disk caching program.  This will typically improve all disk
   performance; the drop-off is still there, just not as dramatic.
3) Increase the number of block buffers with the config.sys BUFFERS=
   command.  Beyond about 20, performance actually starts dropping, and
   you would need over 50!

Tom Almy
toma@tekgvs.TEK.COM
Standard Disclaimers Apply
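[Tom's empirical rule lines up rather well with Tim's observation; the
rule is his, but this arithmetic framing is mine and is only a sketch:]

```python
# Tom Almy's empirical rule: searching slows sharply past
# 12 * BUFFERS files in one directory.
def almy_threshold(buffers):
    return 12 * buffers

def buffers_needed(n_files):
    """Smallest BUFFERS value whose threshold covers n_files."""
    return -(-n_files // 12)   # ceiling division

print(almy_threshold(40))    # 480 - near the ~512-file knee Tim saw
print(buffers_needed(650))   # 55  - matching "you would need over 50"
```

With Tim's BUFFERS=40 the predicted knee is 480 files, close to the ~512
he measured; covering all 650 files would indeed take more than 50 buffers.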
ward@chinet.chi.il.us (Ward Christensen) (12/23/88)
In article <1078@pilchuck.Data-IO.COM> del@pilchuck.Data-IO.COM (Erik
Lindberg) writes about...  (attributions getting too lengthy, pruned)

The original article wondered why a disk with >650 files per subdirectory
was slow doing DIR and other things.  I stated:

>> What is happening is fragmentation of your directory, resulting in
>>many seeks while processing the directory.

while Erik stated "This might SEEM to be what is happening on your system,
but it isn't right", and said that in a case where his directories
completely fit in cache (eliminating seek time), he found "a massive
increase in file access time when the number of files in a subdirectory
exceeds 512."

To defend my initial comment (but not to say Erik is wrong the way he said
I was): my CompuServe capture directory slowed WAY down at one time, and I
finally traced it.  Deleting files from the directory didn't help, so I
looked - and found my directory in 3 extents "all across the disk" (say, at
1/4, 1/2, and 3/4 of the way across).  I copied the files in the 3rd extent
to a temp directory, manually patched the FAT to no longer point to the 3rd
extent, then copied the files back - so I had the SAME NUMBER of files, but
in 2 extents instead of 3 - and the speed improvement was very significant.

So, Erik, in MY case the NUMBER OF DIRECTORY EXTENTS was VERY significant -
probably due to the slow seek time.  In YOUR case it was the sheer number
of files.  We're both right - like a car that can misfire from either fuel
or electrical problems, a hard disk can slow down from either too many
files or too many directory extents (and probably more things).

Happy Holly-Daze to you all.
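[A toy model of the two effects being argued about here - mine, not from
either poster, and assuming a strictly linear scan of 16 entries per
512-byte sector:]

```python
# Toy model: a linear directory scan reads every sector before the
# target entry, and a directory split across E extents adds E-1 long
# seeks to every full scan.  (My simplification, not from the thread.)
ENTRIES_PER_SECTOR = 512 // 32   # 16 directory entries per sector

def sectors_to_reach(entry_index):
    """Sectors read to reach directory entry number entry_index (0-based)."""
    return entry_index // ENTRIES_PER_SECTOR + 1

def extra_seeks(extents):
    """Long seeks added per full directory scan by fragmentation."""
    return max(extents - 1, 0)

print(sectors_to_reach(9), sectors_to_reach(649))   # 1 vs 41 sectors
print(extra_seeks(3), extra_seeks(2))               # 2 vs 1 seeks
```

The first function captures Erik's pure entry-count effect (and Tim's
first-ten vs last-ten observation); the second captures Ward's extent
effect.  Both costs are real and independent, which is the thread's
conclusion.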
dave@westmark.UUCP (Dave Levenson) (12/24/88)
In article <7244@chinet.chi.il.us>, les@chinet.chi.il.us (Leslie Mikesell) writes:
> In article <10723@s.ms.uky.edu> simon@ms.uky.edu (Simon Gales) writes:
>
> >DOS is having to search through a _lot_ of files to
> >find the one it wants to open.  If the directory is in your path,
> >it should at least be the last one.
>
> Is there any way to avoid having DOS search your current directory
> (i.e., only look in the PATH) for programs?  This is especially a problem
> when working in large directories over a network.

The only way in standard MS-DOS is to use complete pathnames for your
executables.  For example, don't say:

        DISKCOPY A: B:

but instead, if your DOS commands are loaded, for example, in C:\DOS, use
the command:

        C:\DOS\DISKCOPY A: B:

In this case, no search of the current directory or the PATH is made.  This
will speed things up if you use large PATH values, if your current
directory is large, or if you are on a slow (e.g. network) device.  It also
makes you less likely to be hit by a trojan horse named DISKCOPY that
someone sneaks into your current directory!
--
Dave Levenson
Westmark, Inc.
The Man in the Mooney
Warren, NJ USA
{rutgers | att}!westmark!dave