[mod.computers.apollo] slow disks

lwk@CAEN.ENGIN.UMICH.EDU.UUCP (03/13/87)

We have discovered a curious phenomenon. A large Fortran job, run on
two dn3000s (one disked, the other diskless and booted off the
former), seems to run faster on the diskless node. Here are the facts
in the case:
1. Both are dn3000s with 4 Megs. The disk is an MSD-154.
2. The nodes are part of a large network
3. The data and object files reside on a second disk elsewhere in the
   network. The job is theoretically CPU-bound, however.
4. The dspst shell command shows the disked node version uses 20% of
   the CPU time. It also shows heavy Winchester I/O, which I assume
   is paging. The diskless version uses 95% of the CPU time.
5. The program is 6 MB, and is mostly large double-precision arrays.
6. When the program was run simultaneously on both nodes, the diskless
   version used ~20% of the CPU time and the disked version used ~10%.
7. netsvc -p xxx executed during run (6) above didn't seem to affect
   anything.
Why would a diskless node run faster than a disked node when both are using
the same disk? This question is important and may affect future purchases.
Can anybody out there (Apollo?) tell me what's going on?
 Woody Kellum <lwk@caen.engin.umich.edu>

fischer-michael@YALE.ARPA.UUCP (03/24/87)

    Why would a diskless node run faster than a disked node when both are using
    the same disk? This question is important and may affect future purchases.
    Can anybody out there (Apollo?) tell me what's going on?

Memory!  When running diskless, you have in effect 8MB of memory
-- 4 on the diskless node and 4 on its disked partner.  Fetching
a page from your partner's memory is much faster than bringing it
in from a local disk.  You mention that you are running a 6MB program
on a 4MB node; that will likely result in a lot of paging activity.
It sounds like your best bet is to get more memory.

--Mike Fischer
-------
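
To make the memory argument above concrete, here is a minimal sketch of
why a 6 MB job can page so heavily on a 4 MB node. It assumes a 1 KB page
size, a strict LRU replacement policy, a program that sweeps its arrays
sequentially over and over, and that all of physical memory is available
to the job. None of these are confirmed by the thread; the function name
and parameters are illustrative only, and this is not a model of the
actual Apollo pager.

```python
from collections import OrderedDict


def count_page_faults(num_pages, num_frames, sweeps):
    """Count faults under strict LRU for repeated sequential sweeps.

    num_pages  -- pages touched by the program (hypothetical working set)
    num_frames -- page frames of memory available to hold them
    sweeps     -- number of full passes over the data
    """
    resident = OrderedDict()  # page -> True, ordered oldest to newest use
    faults = 0
    for _ in range(sweeps):
        for page in range(num_pages):
            if page in resident:
                resident.move_to_end(page)  # mark as most recently used
            else:
                faults += 1
                if len(resident) >= num_frames:
                    resident.popitem(last=False)  # evict least recently used
                resident[page] = True
    return faults


PAGE_KB = 1                            # assumed page size, not from the thread
program_pages = 6 * 1024 // PAGE_KB    # 6 MB program
small_memory = 4 * 1024 // PAGE_KB     # 4 MB of memory
large_memory = 8 * 1024 // PAGE_KB     # "in effect 8MB"

print(count_page_faults(program_pages, small_memory, sweeps=3))
print(count_page_faults(program_pages, large_memory, sweeps=3))
```

Under these assumptions, the 4 MB case faults on every page touched in
every sweep, because each page is evicted before the sweep comes back to
it. The 8 MB case faults only on the first touch of each page. Real
access patterns are less regular, so treat the gap as an upper bound on
the effect rather than a prediction.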

ram-ashwin@YALE.ARPA.UUCP (03/24/87)

>   Why would a diskless node run faster than a disked node when both are using
>   the same disk?

I suspect it's because in the former case you get twice the memory, since the
memory of the disked node effectively acts like a cache for paging off its
disk.  Since you're running a very large job, the paging overhead on a disked
node will kill you if you're also executing your job on the same node.

-- Ashwin Ram.

ARPA:    Ram-Ashwin@yale
UUCP:    {decvax,linus,seismo}!yale!Ram-Ashwin
BITNET:  Ram@yalecs

-------
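
As a rough check on the paging explanation, the dspst figures from the
original post (20% CPU on the disked node, 95% on the diskless node) can
be turned into an estimate of how long the job waits per unit of compute
time. This sketch assumes that all non-CPU time is spent stalled on page
faults and that the dspst percentages describe this job alone; both are
assumptions, and the function name is illustrative only.

```python
def stall_per_unit_compute(cpu_fraction):
    """Wall-clock time spent waiting for each unit of CPU time.

    If a job gets cpu_fraction of the wall clock, the remainder
    (1 - cpu_fraction) is assumed here to be page-fault stall time.
    """
    if not 0.0 < cpu_fraction <= 1.0:
        raise ValueError("cpu_fraction must be in (0, 1]")
    return (1.0 - cpu_fraction) / cpu_fraction


disked = stall_per_unit_compute(0.20)    # figure from the original post
diskless = stall_per_unit_compute(0.95)  # figure from the original post

print(f"disked:   {disked:.2f} units waiting per unit computing")
print(f"diskless: {diskless:.3f} units waiting per unit computing")
print(f"ratio:    {disked / diskless:.0f}x")
```

On those assumptions the disked run waits about four times as long as it
computes, while the diskless run waits only about a twentieth as long,
which is consistent with page faults being far cheaper to service on the
diskless node.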