collier@ariel.unm.edu (uncia uncia) (11/10/89)
here at the UNM computing center, as part of our accounting/capacity
statistics stream, we run a "quot(8)" command every day on all file
systems on all machines to determine disk space usage for all the users
on our systems.

we have noticed a problem. on our Sequent the quot command takes
*forever* to run, while on all our other machines it finishes in a short
while. here are some sample figures from yesterday:

on a Vax 11/785, running 4.3bsd the quot took:
    169.8 cpu seconds (user + system)
    for 1063 MBytes in 18 file systems, all on Fujitsu Eagles
    (takes less than 20 minutes in wallclock time to run).

on a Sequent S27, 6 processors, running Dynix 3.0.12 the quot took:
    46754.6 cpu seconds (user + system)
    for 800 MBytes in 11 file systems, all on Fujitsu Eagles & Swallows
    (takes almost a full day in wallclock time to run).

that is roughly 275 times the cpu time of the Vax, on less data.

this has been going on for a while now. the file systems are not any
fuller (and therefore perhaps more fragmented) on the Sequent. in fact,
the opposite is true. quot is not run on any NFS file system: all file
systems are on disks attached to the machines themselves.

i have done diffs on the source to quot(8) as provided by both Berkeley
and Sequent, and there are no perceivable semantic differences. as well,
the Sequent makes a similarly poor showing in this context against Vaxes
running Ultrix, as well as Sun and Mips-based DEC workstations running
their particular flavors of Unix.

can anyone tell me why the Sequent system performs orders of magnitude
worse than each of our other machines (or for that matter, all of our
other machines *combined*)? the inferences concerning relative file
system performance that can be drawn from these observations are rather
disturbing.

any help would be most appreciated.

--
Michael Collier              University of New Mexico Computing Center
collier@ariel.unm.edu        2701 Campus Blvd.
(505) 277 8039               Albuquerque, NM 87131
(Home: 1160 Don Pasqual NW Los Lunas, NM 87031)
...!cmcl2!beta!unm-la!unmvax!charon!collier

collier@ariel.unm.edu (uncia uncia) (11/11/89)
problem solved: Dynix quot *does* hash uid's, linearly in a table 4096
entries big. this means that uid's bigger than 4096 (most of them, on
our machine) don't get hashed at all. given that we can't get YP to run
for more than a few hours on our Sequent without it hanging (which means
we don't run it at all while we debug *it*), things take a lot longer
than they should :-).

a trivial fix is simply to increase the size of the basic hash table to
65535. the same code runs in < 60 cpu seconds (down from ~45K seconds).
uses a little more virtual memory, but we can live with that at 2am,
until YP is fixed.

much thanks to vic@mentor.cc.purdue.edu for hitting me over the head
with a big board and clearing up my tunnel vision! he also gave me some
nifty diffs to use in case you don't want to use that much memory.

--
Michael Collier              University of New Mexico Computing Center
collier@ariel.unm.edu        2701 Campus Blvd.
(505) 277 8039               Albuquerque, NM 87131
(Home: 1160 Don Pasqual NW Los Lunas, NM 87031)
...!cmcl2!beta!unm-la!unmvax!charon!collier
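
for anyone who would rather not grow the table, here is a minimal sketch of one alternative: hash the uid modulo the table size and chain collisions, so a uid above 4096 still gets a fast lookup. this is not the Dynix source and not the diffs from vic, neither of which is reproduced here. the names NBUCKETS, struct usage, usage_lookup and usage_charge are all made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define NBUCKETS 4096           /* hypothetical size, same figure as in the post */

struct usage {                  /* hypothetical per-user record */
    long uid;                   /* owner of the files */
    long blocks;                /* running total of blocks charged to uid */
    struct usage *next;         /* next record in the same bucket */
};

static struct usage *buckets[NBUCKETS];

/* find the record for uid, creating it on first use.
 * returns NULL only if malloc fails. */
struct usage *usage_lookup(long uid)
{
    unsigned long h;
    struct usage *u;

    assert(uid >= 0);
    h = (unsigned long)uid % NBUCKETS;  /* any uid maps to a valid bucket */

    for (u = buckets[h]; u != NULL; u = u->next)
        if (u->uid == uid)
            return u;

    u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->uid = uid;
    u->blocks = 0;
    u->next = buckets[h];               /* push onto the bucket's chain */
    buckets[h] = u;
    return u;
}

/* charge nblocks to uid; returns the new total, or -1 if out of memory. */
long usage_charge(long uid, long nblocks)
{
    struct usage *u = usage_lookup(uid);

    if (u == NULL)
        return -1;
    u->blocks += nblocks;
    return u->blocks;
}
```

a caller would invoke usage_charge(uid, nblocks) once per inode scanned. two uids that land in the same bucket (say 100 and 4196) stay separate records on one chain, so the cost per lookup depends on chain length rather than on how large the uid is.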