andy@carcoar.Stanford.EDU (Andy Freeman) (06/09/88)
I'd like to get some data from this round of the biannual "why VM is no good for supercomputers" topic. The supercomputer people aren't stupid, but they aren't real fond (at least not on comp.arch) of telling us what supports their position either. I'll define some variables, make some claims about their relative values, and the rest of you can tell me their real values in supercomputer environments.

Ar  - cost of the bit bashing required by the application on a machine without VM hardware, assuming infinite memory
Tr  - runtime cost to make finite memory behave like infinite memory for the application on a machine without VM hardware
Adr - cost of developing the bit bashing program whose runtime cost is Ar
Tdr - is to Tr as Adr is to Ar (that is, the cost of developing the memory management whose runtime cost is Tr)

ArVM, TrVM, AdrVM, and TdrVM are the analogous costs using VM hardware. (ArVM is without paging, so the difference between it and Ar reflects the overhead of VM capability.)

The cost of running the application on the machine without VM hardware is Ar + Tr, but the total cost of that application is Ar + Tr + Adr + Tdr. The VM computer has analogous cost equations.

Let's assume that Adr = AdrVM and Ar < ArVM. Furthermore, Tr < TrVM (with competent programming; otherwise all bets are off), so the application's runtime should be lower without VM. However, TdrVM < Tdr, so comparing the total costs is more complicated.

Since the Los Alamos people edit on Crays, they think that development costs are a healthy fraction of an application's total cost. (If your users don't cost more than the appropriate computers, you should get better users, unless they're grad students.) The development costs might even exceed the runtime costs, since people waiting for a six (or twenty) hour run can do something else, while the developers are busy during extra development time. Furthermore, the marginal cost of making someone/some group wait an extra half-hour (or three hours, for the twenty hour run) may not be that high compared to the cost of an extra hour of development time. (The latter is just my guesstimate of the amortized difference between Tdr and TdrVM.)

Giles' Ar = 0.9 * ArVM seems reasonable, but I'd like to read what the architects say. I suspect that a modest amount of TdrVM, say Tdr/4, can push TrVM close to Tr.

So, what are typical values for all of these variables? If absolute values aren't available, how do the development and runtime costs break down into Tdr/Adr and Tr/Ar, and how does the total runtime cost compare to the total development cost?

-andy
UUCP: {arpa gateways, decwrl, uunet, rutgers}!polya.stanford.edu!andy
ARPA: andy@polya.stanford.edu
(415) 329-1718/723-3088 home/cubicle
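
To make the comparison concrete, here is a minimal Python sketch of the cost equations from the post (runtime = Ar + Tr, total = Ar + Tr + Adr + Tdr). The class and field names are hypothetical, and every number is made up for illustration in arbitrary cost units. Only the ratios Ar = 0.9 * ArVM and TdrVM = Tdr/4 come from the text, and "TrVM close to Tr" is assumed here to mean 20% higher. Real values are exactly what the post is asking for.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Cost breakdown for one machine, in arbitrary cost units."""
    app_run: float  # Ar  (or ArVM): bit bashing, assuming infinite memory
    mem_run: float  # Tr  (or TrVM): making finite memory look infinite
    app_dev: float  # Adr (or AdrVM): developing the bit bashing code
    mem_dev: float  # Tdr (or TdrVM): developing the memory management

    def runtime(self) -> float:
        # Cost of running the application: Ar + Tr
        return self.app_run + self.mem_run

    def total(self) -> float:
        # Total cost of the application: Ar + Tr + Adr + Tdr
        return self.runtime() + self.app_dev + self.mem_dev


# Illustrative values only (assumptions, not measurements).
ar_vm = 10.0   # ArVM
tr = 1.0       # Tr
adr = 40.0     # Adr = AdrVM, as assumed in the post
tdr = 8.0      # Tdr

no_vm = Costs(app_run=0.9 * ar_vm,  # Giles' Ar = 0.9 * ArVM
              mem_run=tr,
              app_dev=adr,
              mem_dev=tdr)

vm = Costs(app_run=ar_vm,
           mem_run=1.2 * tr,        # assumed: TrVM "close to" Tr
           app_dev=adr,
           mem_dev=tdr / 4)         # TdrVM = Tdr/4, from the post

for name, c in (("no VM", no_vm), ("VM", vm)):
    print(f"{name}: runtime={c.runtime():.1f} total={c.total():.1f}")
```

With these invented numbers the non-VM machine wins on runtime while the VM machine wins on total cost, which is the situation the post describes; different values for the development terms could reverse that.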