jon@cit-vax (Jonathan P. Leech) (04/21/85)
I took advantage of the Celerity demonstration mentioned last week and brought up mined, a full-screen editor written at Caltech consisting of roughly 20,000 lines of C. The only problem encountered was #including a file needed by <sys/proc.h>, which required adding a -D flag to the compilation. The editor seems to operate perfectly.

In terms of the machine's performance, the following compilation times and image sizes may be of interest (compared to machines at Caltech).

Machine            Compilation time          Program size
                   user   system   total     text    data     bss  total(dec)
Sun 2/4.2 BSD      2404 +   1612    4016   268288   38912   11640      318840
Celerity           1489 +    513    2002   544768   40960   27888      613616
VAX 780/4.2 BSD    1268 +    196    1464   221184   35840   28408      285432

I find the claimed 2x780 performance unlikely in view of these results (admittedly measuring only compilation speed). Also, this machine seems to be a RISC, judging from the size of code produced and a cursory look at the output of cc -S; alternatively, the -O switch to cc invokes a DE-optimizer. Does anyone know for sure?

Thanks to RIACS for the chance to evaluate the machine.

Jon Leech
jon@cit-vax.arpa
__@/
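For anyone who wants the ratios behind the figures above, here is a small C sketch. The struct, enum and function names are made up for illustration; the numbers are copied from the post, and the time units are whatever the original measurement used (the post does not say).

```c
#include <assert.h>

/* One row of the figures quoted above: compilation time (units as
 * reported in the post) and text-segment size. */
struct machine {
    const char *name;
    long user, sys;      /* compilation time, user and system */
    long text;           /* size of the text segment */
};

static const struct machine table[] = {
    { "Sun 2/4.2 BSD",   2404, 1612, 268288 },
    { "Celerity",        1489,  513, 544768 },
    { "VAX 780/4.2 BSD", 1268,  196, 221184 },
};

/* Indices into table[] (hypothetical names). */
enum { SUN, CELERITY, VAX780 };

/* Total compile time (user + system) of machine m relative to base;
 * values above 1.0 mean m took longer than base. */
double time_ratio(int m, int base)
{
    assert(table[base].user + table[base].sys > 0);
    return (double)(table[m].user + table[m].sys) /
           (double)(table[base].user + table[base].sys);
}

/* Text-segment size of machine m relative to base. */
double text_ratio(int m, int base)
{
    assert(table[base].text > 0);
    return (double)table[m].text / (double)table[base].text;
}
```

time_ratio(CELERITY, VAX780) comes out near 1.37 and text_ratio(CELERITY, VAX780) near 2.46: on this one compile the Celerity took about a third longer than the 780 and produced roughly two and a half times as much text.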
dave@RIACS.ARPA (Dave Gehrt) (04/21/85)
I had a discussion with the folks at Celerity on Friday, and was supposed to add to the motd that the system is currently using a couple of 120 MB 5-1/4 inch winchesters for disks. They are considering adding an eagle to the system, but I am not sure when that may happen, if at all. I don't have any idea about the actual performance of those disks, but they say the difference in seek time is 30 ms versus 18 ms for an eagle. The transfer rate would be higher also on a multibus disk. If that disk swap happens I'll let you know.

I also recall that they said something about a speeded-up version of the C compiler being in the works. The current version is pcc based.

Thank you for taking the time to reply.

dave
----------
hammond@petrus.UUCP (04/23/85)
I have run a fair number of simple benchmarks on a Celerity C1200, Pyramid 90x, Vax 780, and Vax 785 to compare the performance of the CPUs. The machines all had optional floating point accelerators; the Pyramid also had a data cache option.

The basic results: For double precision floating point in C (using register double variables, which the 4.2 BSD and Pyramid appear to equate to double variables), I can confirm that the Celerity C1200 appears to be 2 times an 11/780 w/FPA. That makes it the fastest floating point of the 4 types tested. At least on the trivial integer benchmarks we tested, I can also say that the basic CPU for integer arithmetic appears to be about 3 times an 11/780, or roughly the same as a Pyramid 90x.

Disk Performance: Although my trivial benchmarks took almost the same amount of CPU (using their new, faster cc) as the Pyramid, they took 3 times as long in real time. Our Pyramid has eagles, the Celerity had the slower 120Mb disks. I don't know what improvement an eagle would make.

Flies in the ointment: The Celerity is a Fortran machine; it has a stack register array (I'd call it a cache, but caches in my view empty/fill automagically and this doesn't) of 16 levels. If your code makes procedure calls which nest to a depth of greater than 16, then the OS has to copy the registers to main memory. This is VERY expensive in CPU time. Our test of Ackermann's function died after CPU times of 6.3 user, 107.5 sys (to do all those copies of the stack registers). It died because of a second flaw: the stack can only grow to a depth of 128K (about 1024 calls deep) by default. You can (at compile time) tell the system to allocate more stack space. I have not yet received an explanation of why they made this change in behaviour from standard BSD. If there is a good reason, we could probably live with it, since few procedures (other than Ackermann's) get all that deep.

However, the stack register array filling/unfilling is a more immediate concern, since it is quite expensive in CPU resources and it does happen. We noted that the C compiler rolled up fair amounts of system time (several times a Pyramid 90x), probably for stack growth.

Another problem we noted was that the system calls we tried measuring (some of those common to Sys V and 4.2 BSD) were on average 20% slower than an 11/780, despite having a (by our tests) 3 times faster CPU. We are still trying to find out what is going on. My suspicion is the loading/unloading of the stack register set for context saves.

If Celerity fixes the stack growth to be less painful, it is a very interesting machine for number crunching.

Rich Hammond
{allegra | decvax | ucbvax} bellcore!hammond
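To make the nesting problem concrete, here is a rough C sketch of Ackermann's function instrumented with a toy model of a 16-level stack register array. The 16 comes from the post above; everything else is an assumption for illustration: the counter names, the enter()/leave() helpers, and above all the spill policy (one frame copied out per overflowing call, one copied back per underflowing return). The real C1200 hardware and OS may well behave differently.

```c
#include <assert.h>

#define WINDOW 16          /* levels in the stack register array (from the post) */

/* Bookkeeping for a HYPOTHETICAL spill model: at most WINDOW frames are
 * resident; one frame is copied out per overflowing call and one is
 * copied back per underflowing return. */
static long depth;         /* currently active calls */
static long spilled;       /* frames currently copied out to memory */
long max_depth, spills, fills, calls;

static void enter(void)
{
    calls++;
    if (++depth > max_depth)
        max_depth = depth;
    if (depth - spilled > WINDOW) {       /* array full: copy oldest frame out */
        spilled++;
        spills++;
    }
}

static void leave(void)
{
    depth--;
    if (depth > 0 && depth == spilled) {  /* caller's frame sits in memory */
        spilled--;
        fills++;
    }
    assert(spilled >= 0 && spilled <= depth);
}

/* Ackermann's function, written recursively so the nesting is visible. */
long ack(long m, long n)
{
    long r;

    enter();
    if (m == 0)
        r = n + 1;
    else if (n == 0)
        r = ack(m - 1, 1);
    else
        r = ack(m - 1, ack(m, n - 1));
    leave();
    return r;
}

/* Clear all counters before a new measurement. */
void reset_counters(void)
{
    depth = spilled = 0;
    max_depth = spills = fills = calls = 0;
}
```

Calling reset_counters() and then ack(3, 3), and looking at max_depth, spills and fills afterwards, gives a feel for how often a deeply recursive routine crosses the 16-level boundary under this model. It says nothing about what each copy actually costs on the C1200.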
hammond@petrus.UUCP (04/23/85)
> ...
> Disk Performance: Although my trivial benchmarks took almost the same amount
> of CPU (using their new, faster cc) as the Pyramid, they took 3 times as
> long in real time. Our Pyramid has eagles, the Celerity had the slower
> 120Mb disks. I don't know what improvement an eagle would make.

I meant to say that the compiles of the trivial benchmarks took almost the same user CPU; the benchmarks themselves are CPU bound and do no I/O. The system CPU on the Celerity was twice that of a Pyramid 90x (i.e. 6.5 vs 2.9), which I suspect was stack register copy time. The elapsed real times were more like 2+ than 3 times. (I just found my notes.)