[comp.arch] VM vs Supercomputers

andy@carcoar.Stanford.EDU (Andy Freeman) (06/09/88)

I'd like to get some data from this round of the biannual "why VM is
no good for supercomputers" topic.  The supercomputer people aren't
stupid, but they aren't real fond (at least not on comp.arch) of
telling us what supports their position either.

I'll define some variables, make some claims about their relative
values, and the rest of you can tell me their real values in
supercomputer environments.
	
Ar    - cost of the bit bashing required by the application on a machine
	without VM hardware assuming infinite memory
Tr    - runtime cost to make finite memory behave like infinite memory
	for the application on a machine without VM hardware
Adr   - cost of developing the bit bashing program whose runtime cost is
	Ar
Tdr   - cost of developing the code whose runtime cost is Tr (Tdr is
	to Tr as Adr is to Ar)

ArVM, TrVM, AdrVM, and TdrVM are the analogous costs using VM
hardware.  (ArVM is without paging, so the difference between it and
Ar reflects the overhead of VM capability.)

The cost of running the application on the machine without VM hardware
is Ar + Tr, but the total cost of that application is Ar + Tr + Adr +
Tdr.  The VM computer has analogous cost equations.

Let's assume that Adr = AdrVM and Ar < ArVM.  Furthermore, Tr < TrVM
(with competent programming, otherwise all bets are off) so the
application's runtime should be lower without VM.  However, TdrVM <
Tdr, so comparing the total costs is more complicated.
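
Under the assumption Adr = AdrVM, the comparison of total costs
reduces to a single inequality.  The block below is only algebra on
the definitions above; the names C and C_VM for the two totals are
introduced here for illustration and are not part of the original
notation.

```latex
\begin{align*}
C      &= A_r + T_r + A_{dr} + T_{dr} \\
C_{VM} &= A_{rVM} + T_{rVM} + A_{drVM} + T_{drVM} \\
\intertext{With $A_{dr} = A_{drVM}$, the development terms for the
bit bashing cancel, so}
C_{VM} < C \iff (A_{rVM} - A_r) + (T_{rVM} - T_r) &< T_{dr} - T_{drVM}
\end{align*}
```

In words: VM wins on total cost exactly when the development time it
saves on memory management exceeds the extra runtime it costs.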

Since the Los Alamos people edit on Crays, they think that development
costs are a healthy fraction of an application's total cost.  (If your
users don't cost more than the appropriate computers, you should get
better users, unless they're grad students.)  The development costs
might even exceed the runtime costs since people waiting for a six (or
twenty) hour run can do something else while the developers are busy
during extra development time.  Furthermore, the marginal cost of
making someone/some group wait an extra half-hour (or three hours, for
the twenty hour run) may not be that high compared to the cost of an
extra hour of development time.  (The latter is just my guesstimate of
the amortized difference between Tdr and TdrVM.)

Giles' Ar = 0.9 * ArVM seems reasonable, but I'd like to read what the
architects say.  I suspect that a modest amount of TdrVM, say Tdr/4,
can push TrVM close to Tr.
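
To make the bookkeeping concrete, here is a minimal sketch of the
cost model in Python.  Every number in it is HYPOTHETICAL, chosen only
to satisfy the relations stated above (Ar = 0.9 * ArVM, Adr = AdrVM,
TdrVM = Tdr/4, TrVM slightly above Tr); none of it is measured data,
and the class and field names are made up for this example.  All
costs are assumed to be expressed in one common, arbitrary unit.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Cost terms for one machine, all in the same arbitrary unit."""
    a_run: float  # Ar / ArVM: runtime cost of the bit bashing
    t_run: float  # Tr / TrVM: runtime cost of faking infinite memory
    a_dev: float  # Adr / AdrVM: cost of developing the bit bashing
    t_dev: float  # Tdr / TdrVM: cost of developing the memory management

    def runtime(self) -> float:
        # Cost of running the application: Ar + Tr
        return self.a_run + self.t_run

    def total(self) -> float:
        # Total cost: Ar + Tr + Adr + Tdr
        return self.runtime() + self.a_dev + self.t_dev


# HYPOTHETICAL values, not measurements.
# Ar = 0.9 * ArVM (9.0 vs 10.0), Adr = AdrVM, TdrVM = Tdr / 4,
# and TrVM assumed to end up 10% above Tr.
no_vm = Costs(a_run=9.0, t_run=1.0, a_dev=20.0, t_dev=8.0)
vm = Costs(a_run=10.0, t_run=1.1, a_dev=20.0, t_dev=2.0)

for name, c in (("no VM", no_vm), ("VM", vm)):
    print(f"{name:6s} runtime={c.runtime():.1f} total={c.total():.1f}")
```

With these made-up inputs the machine without VM has the lower runtime
cost but the higher total cost.  Different inputs flip the result,
which is why the real values matter.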

So, what are typical values for all of these variables?  If absolute
values aren't available, how do the development and runtime costs
break down into Tdr/Adr and Tr/Ar and how does the total runtime cost
compare to the total development cost?

-andy
UUCP:  {arpa gateways, decwrl, uunet, rutgers}!polya.stanford.edu!andy
ARPA:  andy@polya.stanford.edu
(415) 329-1718/723-3088 home/cubicle