ucbesvax.turner@ucbcad.UUCP (06/15/83)
#R:noscvax:-13800:ucbesvax:12800001:37777777600:1049
ucbesvax!turner    May 25 00:41:00 1983

The multiple-dedicated-processors scheme embodied in the BCC 500 might well have been an outgrowth of frustration with Berkeley's CDC 6400 system. From vague recollection of conversations with people remotely associated with it, CAL TSS (I think it was) was a bomb as a time-sharing system because Seymour Cray had done too good a job of designing a batch machine. Nothing glaringly wrong with the software architecture (that I know of), but it was poorly matched to the available hardware.

The 6400 also had multiple dedicated processors, and was a phantasmagorical design. It was built entirely with discrete components at a time when everybody was going for ICs. Cray just didn't trust what he couldn't see, apparently. (Even now, his machines are built up from a very small repertoire of chips whose design he has overseen--except, perhaps, for the ECL RAMs.) Notwithstanding, the 6400 was not retired until last year, although it had been running in the red as an operation for quite some time.

	Michael Turner
	ucbvax!ucbesvax.turner
hal@cornell.UUCP (09/04/83)
#R:rlgvax:0:cornell:-1:37777777600:1213
cornell!hal    Jul 3 09:47:00 1983

One of the articles in this conversation said something like "I don't understand why they [the 8008-8080 designers] didn't do a better job. After all, much was known about computer architecture at the time."

Well, yes, but... how many good chip designers are also good computer architects? Perhaps there are some now, but back when the original microprocessors were built, probably very few people were good at both.

The original microprocessors remind me a lot of early computer designs, from before very much was known about computer architecture. The hardware folks put together something that could execute instructions, then left it to the software folks to see if they could figure out how to use it effectively and come up with good code generators for compilers in spite of the instruction set.

There are VERY few examples of good computer design. The only encouraging thing is that there is more awareness of how hard it is to do it right, and that a really good design must take into account lots more than circuits.

Hal Perkins                   uucp:   {decvax|vax135|...}!cornell!hal
Cornell Computer Science      arpa:   hal@cornell
                              bitnet: hal@crnlcs
johnl@haddock.UUCP (01/31/84)
#R:parsec:32800003:haddock:9500008:000:616
haddock!johnl    Jan 30 10:31:00 1984

Without trying to fan the flames, I believe that the advantages of two's complement arithmetic over its main competitors, one's complement and sign-magnitude, are:

-- Unique representation of zero (the others have +0 and -0).

-- Simpler design of adders and subtracters.  The other two require tweaks like end-around carry to get the right answer.  Multiplication and division are consequently simpler, too.

-- Easier software implementation of multiple-precision arithmetic, since signed and unsigned addition are the same except for the interpretation of the result.

John Levine, ima!johnl
coulter@hpbbla.UUCP (05/14/84)
>> A lot depends on the C-2 beyond the survival of Cray Research.
I am intrigued by what the author meant by this. Would the author
(or anyone else) please tell me.
perry@hp-dcde.UUCP (perry) (01/10/85)
/***** hp-dcde:net.arch / dartvax!chuck / 8:35 pm Jan 9, 1985 *****/

Now suppose we had a cache that was much more under a programmer's control. To be concrete, suppose we have a cache of, say, 32 elements each containing 32 words. And suppose our processor has a load cache instruction with syntax:

	load cache <cache address> <memory address>

dartvax!chuck
/* ---------- */

I think the biggest problem right now is the compiler technology. The compiler would have to recognize instruction locality when it occurs in a program. Another intelligent approach is locality lookahead, similar to the instruction lookahead which has been done on the big machines for years. This all sounds like a compiler-writer's nightmare to implement. I have not yet seen a compiler optimized for cache technology. Of course, assembly-level programmers could hand-code this instruction, but I think that would be a step backward (or sideways).

You allude to the idea of a segmented cache. The idea of RISC-type register files might be applicable to caches. Especially in environments which tend to switch contexts rapidly (such as a program making a lot of OS calls), this may avoid the problem of voiding a valid cache on a context switch.

Perry Scott
!hplabs!hpfcla!perry
roy@phri.UUCP (Roy Smith) (01/15/85)
> Now suppose we had a cache that was much more under a programmer's control.
> And suppose our processor has a load cache instruction with syntax:
>	load cache <cache address> <memory address>
>		dartvax!chuck
>
> I think the biggest problem right now is the compiler technology.  The
> compiler would have to recognize instruction locality when it occurs in
> a program.
>		Perry Scott, !hplabs!hpfcla!perry

Maybe the compiler could get some help from the programmer. We already have register variables; why not have an extension to C which allows CACHEON and CACHEOFF keywords? (OK, you guys from the C standards committee, flame on, I'm ready for you; so we'll call it C++prime-star :-).) These could be used to bracket that section of code you want to make sure gets cached.

It is probably a given that anything bigger than a micro will have some kind of cache (I'm not sure about 68K-based systems; anybody know?), and the variability in the size, update strategy, and organization of caches should be no worse a problem than the corresponding differences in register sets. Besides, just as with the register keyword, the compiler takes this as a hint which it is free to ignore if it doesn't make sense for the particular machine architecture.

Perhaps instead of having bracketed sections of code, we could have a CACHE keyword which is only valid in a function declaration and makes the whole function get cached (yeah, I know, what if the function is 2.3K bytes of code and you only have 2K of cache?). This latter version would be particularly good for things like interrupt routines which run often, but usually with enough other stuff in between invocations to flush the cache.

Possibly you would want the load cache instruction to be kernel-mode-only, to keep user code from hogging it? Perhaps you want one part of the cache to be programmer-allocatable using the load cache instruction and part to be up for grabs? Perhaps even one part reserved for kernel allocation, one part for user allocation, one part up-for-grabs data, and one part up-for-grabs text (hey, it's got to be a power of 2, right?). Hey, why not do something really clever and put J-11's in all the device controllers and give everybody a MicroVAX II in their terminal so the CPU doesn't have to do anything at all :-)
--
Don't blame me, I just work here....

{allegra,seismo,ihnp4}!vax135!timeinc\
	       cmcl2!rocky2!cubsvax    >!phri!roy (Roy Smith)
	            philabs!cubsvax/
peterb@pbear.UUCP (01/29/85)
Personally, I think that cache should not be under programmer control (don't flame me!). All the memory management models assume a cache of fixed size with a random age distribution. If you start loading and unloading the cache, the approximations for the MM models all go to hell, and you can inadvertently kill the advantage of the cache.

Peter Barada
ima!pbear!peterb
barrett@hpcnoe.UUCP (barrett) (03/10/85)
I don't know about the rest of the world, but 24-bit PHYSICAL address spaces are somewhat marginal. I have seen single-user machines with 16MB of physical memory. 24 bits for VIRTUAL addressing is absurdly small.

Dave Barrett
hplabs!hp-dcd!barrett  or  ihnp4!hpfcla!barrett
richardt@orstcs.UUCP (richardt) (07/10/85)
[sacrifice to the line eater]
>Most of the UN*X utilities take less than 64k
Ha!  Try doing a `top' sometime.  Although many of the standard utilities
do have code in the 1K range, their data spaces tend to grow like crazy.
When you see an 'ls: 1k code, 117k data' you'll understand.  Also, think
about which routines/programs use more than 64K: Emacs, rogue, top, csh,
probably sh and vi as well.  All of those were larger than 100K.  So forget
using 64K segments if you can possibly avoid it.  I will admit that
the resident sizes of those utilities are in the 10K-40K range.  However,
that other 90-100K still has to be addressed!
-------------------------
Is there an assembly-language programmer in the house?
orstcs!richardt
henry@utzoo.UUCP (Henry Spencer) (07/15/85)
> > Most of the UN*X utilities take less than 64k
>
> ... Although many of the standard utilities
> do have code in the 1k- range, their data spaces tend to grow like crazy.
> When you see a 'ls 1k code 117k data' you'll understand.  Also, think
> about what routines/programs use more than 64k.  Emacs, Rogue, Top, CSH,
> probably SH and VI as well.  All of those were larger than 100K. ...

Not "sh", since it was written by people who knew what they were doing. It's hardly surprising that csh, vi, and rogue are bloated, considering where they were written. The Berkloids have forgotten how to make anything small. And emacs is well-known to be elephantine.

Try looking more carefully at /bin sometime; there are *lots* of small programs in there, and a few big ones. The original comment was correct.

"What's that you say?  4.3BSD 'echo' is 150KB?  I'm not surprised."

Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
peterb@pbear.UUCP (07/25/85)
It is trivial to determine overflow in a word x word = word operation. Just do the standard multiply producing 2*word bits and check the high-order word for non-zero values. If it is non-zero, then the result has overflowed. This check can be built in and executed at the end of the multiply, as the resulting low-order word is placed in the destination. It is not that expensive, requiring a few clocks at most.

Peter Barada
{ihnp4!inmet|{harvard|cca}!ima}!pbear!peterb
dougp@ISM780.UUCP (11/18/85)
How come this blather seems to get posted every two weeks whether we need it or not???
allen@uicsrd.CSRD.UIUC.EDU (11/22/85)
If this blather gets posted every 2 weeks why did you respond to 4 week old blather? Wouldn't it be more appropriate to flame the most recent blatherings?
aglew@ccvaxa.UUCP (05/14/86)
>/* Written 7:08 pm Apr 29, 1986 by mac%uvacs@uvacs.UUCP in net.arch */
>> .... Losing indexing for array accesses isn't too bad, since
>> just about every expensive array operation can be written using
>> pointers instead of indexes (post-increment doesn't require a
>> carry to perform a memory address).
>
>No semi-random access by subscripts? Your machine is to be
>programmed exclusively in C or assembler, never Fortran.

The Fortran codes to worry about are the matrix processing codes, like the Livermore loops. A hell of a lot of money has been spent figuring out how to vectorize, and use other optimizations for, these - you can almost buy them off the shelf.

>> So what can you do? Adding an index register in slows down the rest of
>> your instruction set,
>
>Even when you don't index?

Yep. (1) On a short-pipeline machine, a pipeline stage has to be the size of an addition. (2) If you add extra pipeline stages to absorb the addition, then - since there is nearly always a chain from the front to the back of the pipe - you've slowed it down again.

>> .... and, if memory is cheap, you might be willing to pad out a
>> lot more of your structures, to get a speed increase.
>
>or use a separate add instruction.

Slow. That's the point - if you aren't worried about speed you can use the ADD, and get all the flexibility you want. If you are worried about speed, then you could use the OR. Now, if a separate register-to-register OR instruction were faster than a register-to-register ADD... too bad asynchronous systems are out of fashion.
aglew@ccvaxa.UUCP (05/14/86)
I'm told the Star has pages of 512 and 64K. Fujitsu's new machine has pages of 4K and 1M. I believe that Convex's machine has multiple page sizes, although I do not know if these are different page sizes as seen from the page table, or different sizes as seen from the point of view of demand paging (i.e. variable-size clusters).

Andy "Krazy" Glew.  Gould CSD-Urbana.    USEnet:  ihnp4!uiucdcs!ccvaxa!aglew
1101 E. University, Urbana, IL 61801     ARPAnet: aglew@gswd-vms
parafras@uicsrd.UUCP (07/14/86)
The input language to Parafrase is FORTRAN 66 with some extensions. The output is a FORTRAN-like language that cannot be compiled by any FORTRAN compiler. The most noticeable differences are:

1) Loop types other than DO, notably DOALL, DOPIPE, DOACROSS, etc.

2) Non-standard variable names: &, ', lower case, etc. are included in variable names. This is good in that a user of Parafrase can tell what transformation caused some temporaries to be created; it is bad because it's not standard.

It would not be much trouble to write a pass that would convert Parafrase output FORTRAN back to standard serial FORTRAN, but then you really haven't gained a whole lot. It would be more useful to convert Parafrase output to a general standard parallel FORTRAN, but unfortunately such a beast does not yet exist.