davecb@yunexus.YorkU.CA (David Collier-Brown) (07/31/90)
In <5465@darkstar.ucsc.edu> Craig Partridge writes:
| I'm curious.  Has anyone done research on building extremely
| fast file systems, capable of delivering 1 gigabit or more of data
| per second from disk into main memory?  I've heard rumors, but no
| concrete citations.

puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
| Tanenbaum (ast@cs.vu.nl) has developed a distributed system called
| AMOEBA.  Along with the OS-kernel there is a "bullet file server".
| (BULLET because it is supposed to be pretty fast).
|
| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
| so why not load the complete file into memory?  This makes the server
| extremely efficient.  Operations like OPEN or CLOSE on files are no
| longer needed (i.e. the complete file is loaded for each update).

Er, sorta...  You could easily write an interface that did writes or
reads without opens or closes, for some specific subset of uses.

| The benchmarks are quite impressive although I doubt that this
| concept is useful (especially when thinking about transaction
| systems in databases).

Well, I have something of the opposite view: a system like Bullet
makes a very good substrate for a database system.  The applicable
evidence is in the article "Performance of an OLTP Application on
Symmetry Multiprocessor System", in the 17th Annual International
Symposium on Computer Architecture, ACM SIGARCH Vol. 18, Number 2,
June 1990.  (See, a reference (:-).)

The article uses all-in-memory databases in the TP1 benchmark as a
limiting case while investigating the OS and architectural support
necessary for good transaction-processing speeds, and the speeds are
up in the range that Craig may find interesting...

My speculation is that a bullet-like file system with a
relation-allocating layer (call it the Uzi filesystem?  the
speedloader filesystem??) on top would make a very good platform for a
relational database.
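The ``writes or reads without opens or closes'' interface is easy to
sketch.  Below is a toy illustration in C of a Bullet-style store:
whole-file create, whole-file read, delete, and nothing else.  All the
names are invented for the example, and the real Amoeba server hands
out capabilities rather than table indices.

```c
/* Toy sketch of a Bullet-style immutable file store: files are created
 * whole, read whole, and deleted -- no open, close, or partial write.
 * All names here are invented for illustration. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAXFILES 16

struct bfile { unsigned char *data; size_t len; int used; };
static struct bfile table[MAXFILES];

/* Create a file from a complete buffer; returns a "capability" (here
 * just a table index), or -1 on failure. */
int b_create(const void *buf, size_t len)
{
    for (int i = 0; i < MAXFILES; i++) {
        if (!table[i].used) {
            table[i].data = malloc(len);
            if (table[i].data == NULL)
                return -1;
            memcpy(table[i].data, buf, len);
            table[i].len = len;
            table[i].used = 1;
            return i;
        }
    }
    return -1;
}

/* Read the WHOLE file in one operation; no open or close needed. */
long b_read(int cap, void *buf, size_t bufsize)
{
    if (cap < 0 || cap >= MAXFILES || !table[cap].used)
        return -1;
    if (table[cap].len > bufsize)
        return -1;
    memcpy(buf, table[cap].data, table[cap].len);
    return (long)table[cap].len;
}

int b_delete(int cap)
{
    if (cap < 0 || cap >= MAXFILES || !table[cap].used)
        return -1;
    free(table[cap].data);
    table[cap].used = 0;
    return 0;
}
```

Note there is no way to modify a file in place: an update means create
a new file whole and delete the old one, which is what makes the
server's job so simple.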
Certainly the behavior patterns of an in-memory, load-whole-relation
database would be easy to reason about, and would be easy and
interesting to investigate.

| You can download Tanenbaum's original paper (along with a "complete"
| description about AMOEBA) via anonymous ftp from midgard.ucsc.edu
| in ftp/pub/amoeba.

--dave
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
rminnich@super.ORG (Ronald G Minnich) (08/05/90)
>puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
>| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
>| so why not load the complete file into memory?  This makes the server
>| extremely efficient.  Operations like OPEN or CLOSE on files are no
>| longer needed (i.e. the complete file is loaded for each update).

Yes, note that BULLET does not support a write command.  You create,
delete, and read files.  If memory serves, that's it.  This is very
elegant, but there is a problem: we're running out of address bits
again.

I gave the standard "why shared memory is a nice way to do a high-speed
network interface" talk the other day, and someone pointed out that on
Multics, with memory-mapped files, you always had to support the
read-write interface in any program, because the address space of the
machine was too small for the memory-file abstraction to cover all
files, and if your program couldn't deal with all files, it was
useless.  So you hacked your program: if the file is too big, do it
read-write; otherwise use memory files.  And people realized that it
was easier just to do read-write, and stopped bothering with memory
files.

Obviously this applies to architectures we have now: lots of files are
bigger than the 4 GB address space of my Sun, and things are not
getting any better.  And of course on Crays you don't get memory-mapped
files at all.  So the programs I now write that use memory-mapped files
on SunOS always have an out in the event that the mmap fails or the
system I am on does not support it.

Conclusion: Bullet is really cool, as are memory-mapped files, but
their eventual utility is limited by computer-architecture questions.
Since read-write is more general, maybe it is the wave of the future.
Gee, I don't like that!  I have several programs that got lots faster
because calls to read() were replaced by an address computation.  But
my architectures have left me in a bind.
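That ``always have an out'' pattern is worth a sketch.  Something like
the following (plain C; load_file is an invented name and error
handling is abbreviated) tries mmap first and falls back to ordinary
reads when the map fails:

```c
/* Sketch of the "always have an out" pattern: try to map the file,
 * fall back to plain reads if mmap fails or isn't available. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

/* Load len bytes of fd into memory: mapped if possible, copied if not.
 * *mapped tells the caller which release path (munmap vs. free) to
 * use afterwards. */
void *load_file(int fd, size_t len, int *mapped)
{
    void *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p != MAP_FAILED) {
        *mapped = 1;
        return p;
    }

    /* The out: an ordinary buffer filled by ordinary reads. */
    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return NULL;
    size_t got = 0;
    while (got < len) {
        ssize_t n = pread(fd, buf + got, len - got, (off_t)got);
        if (n <= 0) {
            free(buf);
            return NULL;
        }
        got += (size_t)n;
    }
    *mapped = 0;
    return buf;
}
```

The caller sees one pointer either way; only the cleanup differs, which
is exactly the kind of detail the "easier just to do read-write" camp
is complaining about.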
For your example gigabyte-per-second file system, I can run through my
Sun's address space in 4 seconds.  Now what do we do?  Maybe the next
round of address spaces should be large enough to address all the atoms
on the planet -- that should cover us for a while.

ron
--
1987: We set standards, not Them.  Your standard windowing system is NeUWS.
1989: We set standards, not Them.  You can have X, but the UI is OpenLock.
1990: Why are you buying all those workstations from Them running Motif?
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/06/90)
puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
| so why not load the complete file into memory?  This makes the server
| extremely efficient.  Operations like OPEN or CLOSE on files are no
| longer needed (i.e. the complete file is loaded for each update).

rminnich@super.ORG (Ronald G Minnich) writes:
| This is very elegant, but there is a problem.  We're running out of
| address bits again.
|
| I gave the standard "why shared memory is a nice way to do a high speed
| network interface" talk the other day and someone pointed out that on
| Multics, with memory-mapped files, you always had to support the
| read-write interface for any program because the address space of the
| machine was too small for the memory-file abstraction to cover all
| files [...]

To again misquote Morven's Metatheorem, ``any problem in computer
science can be solved with one more level of indirection...''

This is dealt with by doing a transparent interface on top of the large
files, something like Multics MSFs, but done so the ill-advised
applications programmer (me --dave) won't depend on knowing how it was
implemented.

I specifically considered large relational databases built on a bullet
fileserver!  The primitives provided to the DBMS would be read, write,
commit and abort on a pre-initiated relation.  Many relations would be
small enough to load into a properly-configured fileserver; some would
not.  The overlarge ones could be split two ways: transversely or
longitudinally.  Transversely would be transparent to the application,
if not to the human DBM (he'd detect a performance loss, probably).
Longitudinally would be visible to the applications (in a two-schema
DBMS), because the DBM would have to split them based on field-usage
statistics.  It wouldn't be a problem in a three-schema architecture
(modulo fiascos).

Note that this is not a general answer to the problem, though.
Full generality does require some form of ``extra-long address'',
whether implemented as a segment-number sequence, a ``special large
address'' in either hardware or software, or a stdio FILE emulation
library that only provided it for seek/tell operations and hid it
otherwise.  I wouldn't mind the latter too much: it's a nice interface
for 90% of the programs I've ever written, since they mostly read and
wrote small sequential files...  The other 10% took the other 90% of my
time (:-)).

--dave
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
jesup@cbmvax.commodore.com (Randell Jesup) (08/07/90)
In article <30728@super.ORG> rminnich@super.UUCP (Ronald G Minnich) writes:
>>puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
>>| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
>>| so why not load the complete file into memory?  This makes the server
>>| extremely efficient.  Operations like OPEN or CLOSE on files are no
>>| longer needed (i.e. the complete file is loaded for each update).
...
>This is very elegant, but there is
>a problem.  We're running out of address bits again.
...
>and stopped bothering with memory files.  Obviously this applies
>to architectures we have now: lots of files are bigger than the 4 GB
>address space of my Sun, and things are not getting any better.  And of
>course on Crays you don't get memory-mapped files at all.  So the
>programs I now write that use memory-mapped files on SunOS always have
>an out in the event that the mmap fails or the system I am on does not
>support it.  Conclusion: Bullet is really cool, as are memory-mapped
>files, but their eventual utility is limited by computer architecture
>questions.  Since read-write is more general, maybe it is the wave of
>the future.  Gee, I don't like that!

	I submit that your situation is something of an unusual case,
and is likely to remain unusual for at least a decade, perhaps two.
Few machines (percentage-wise) even have 4 GB of storage, let alone
files larger than 4 GB (I've never even seen a file larger than 100 MB,
even on mainframes).

	Eventually, perhaps, but not in the near future.  There are
people who have greater needs; that's the whole justification for the
selling of supercomputers, and the vastly expensive (read: fast &
large) I/O systems that support them.  But they're a tiny minority,
numbers-wise.  Until the number of people who require such things
increases sufficiently, the only architectures to support the extra
address bits will be the super- (and maybe mini-super-) computers.
Those extra address bits are _not_ free, in silicon, memory, etc.
	(I hope we haven't started the 32+ addr bit rwars again...)
--
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup
Common phrase heard at Amiga Devcon '89: "It's in there!"
dave@fps.com (Dave Smith) (08/07/90)
In article <13526@yunexus.YorkU.CA> davecb@yunexus.YorkU.CA (David Collier-Brown) writes:
>puder@zeno.informatik.uni-kl.de (Arno Puder) writes:
>| Tanenbaum's philosophy is that memory is getting cheaper and cheaper,
>| so why not load the complete file into memory?  This makes the server
>| extremely efficient.  Operations like OPEN or CLOSE on files are no
>| longer needed (i.e. the complete file is loaded for each update).
>
I thought the Bullet file server was neat, but... I showed the paper to
one of our customers.  He read through it and laughed.  He said that he
wanted to have files bigger than his main memory; that was the whole
point of disks.

The Bullet server doesn't address the problems of very large files.  I
also see problems with the Bullet server when two large files (each
about as large as the memory size) are needed at the same time.
--
David L. Smith
FPS Computing, San Diego
ucsd!celerity!dave or dave@fps.com
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/07/90)
In article <30728@super.ORG> rminnich@super.UUCP (Ronald G Minnich)
writes [in a discussion of Tanenbaum's Bullet fileserver]:
| >This is very elegant, but there is
| >a problem.  We're running out of address bits again.

jesup@cbmvax.commodore.com (Randell Jesup) writes:
| I submit that your situation is something of an unusual case, and is
| likely to remain unusual for at least a decade, perhaps 2.  Few machines
| (percentage-wise) even have 4 GB of storage, let alone files larger than
| 4GB (I've never even seen a file larger than 100MB, even on mainframes).

Alas, I worked on a library system under Unix... you wouldn't believe
the space costs of describing one book (:-)).  I kept having to check
that expected file maximums wouldn't exceed disk-drive sizes each time
the DBMS vendor postponed using raw partitions again.  100 MB was
perfectly plausible, and we had to plan for at least three existing
sites well over that size.  Obviously we split this across many
"files".

| Eventually, perhaps, but not in the near future.  There are people
| who have greater needs, that's the whole justification for the selling of
| supercomputers, and the vastly expensive (read fast & large) IO systems
| that support them.  But they're a tiny minority, numbers-wise.  Until the
| number of people that require such things increases sufficiently, the only
| architectures to support the extra address bits will be the super- (and
| maybe mini-super-) computers.  Those extra address bits are _not_ free,
| in silicon, memory, etc.  (I hope we haven't started the 32+ addr bit
| rwars again...)

It's always a bad idea to put a hard addressing limit on things: Intel,
based on their past needs, octupled the addressable memory available
when they introduced the 8086, even though they expected 16k was
adequate.  Experience has shown them wrong (;-)).  They needed to
increase it by a somewhat larger factor [see, we're back to computer
architecture again].
I claim that this applies to files, too, and eventually to disks.  If
you put hard limits in, people crash up against them.  In the narrow
case of Bullet, you can reinvent multi-segment files[1], improve the
addressing capabilities of the hardware, or do a combination of the
two.  Or other ideas I haven't even dreamed of yet.

I confess I'd have **real** trouble selling a raw bullet file system to
a customer doing anything but CAD/CAM, software development[2] or small
databases.

--dave

[1] A file which has multiple parts (segments), and which can be
    manipulated transparently as if it had only one part.  Multics MSFs
    were the first, but weren't transparent enough.  If you substitute
    "segment" for "page" in the above, the mechanism used to implement
    it becomes almost obvious.  Alas, it still requires a loooooooong
    integral variable somewhere user-accessible for positioning
    oneself.
[2] Which includes academic computing, you understand: that's what I do
    these days.
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
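Footnote [1]'s substitute-segment-for-page observation comes down to
one address computation.  A hypothetical sketch, with an invented 1 GB
segment size and invented names, of how the loooooooong position would
be decomposed:

```c
/* Sketch of multi-segment file addressing: a loooooooong file offset
 * is split into (segment number, offset within segment), exactly as a
 * virtual address splits into (page number, offset within page).  The
 * 1 GB segment size and all names are invented for illustration. */
#include <assert.h>

typedef unsigned long long bigoff_t;   /* the long integral position */

#define SEG_SHIFT 30                   /* 2^30 = 1 GB segments, say */
#define SEG_SIZE  ((bigoff_t)1 << SEG_SHIFT)

struct segpos {
    unsigned long seg;    /* which underlying file holds the byte */
    unsigned long local;  /* offset inside that file */
};

struct segpos seg_split(bigoff_t off)
{
    struct segpos p;
    p.seg   = (unsigned long)(off >> SEG_SHIFT);
    p.local = (unsigned long)(off & (SEG_SIZE - 1));
    return p;
}

bigoff_t seg_join(struct segpos p)
{
    return ((bigoff_t)p.seg << SEG_SHIFT) | (bigoff_t)p.local;
}
```

A seek library that does this internally can present one flat position
to the application while the fileserver only ever sees ordinary-sized
files.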
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/07/90)
dave@fps.com (Dave Smith) writes:
| I thought the Bullet file server was neat, but...I showed the paper to
| one of our customers.  He read through it and laughed.  He said that
| he wanted to have files bigger than his main memory, that was the
| whole point of disks.

Agreed.  I think I'd like your customer (:-)).

| The Bullet server doesn't address the problems of very large files.  I
| also see problems with the Bullet server when two large files (~ as
| large as the memory size) are needed at the same time.

If one restricts the problem to the fileserver, and agrees that the
data will appear on the compute engine as a series of pages when
needed, then one need merely (ahem) ensure that the fileserver has
enough memory for a number of complete files and can feed the required
pages to the compute machine when asked.

This kind of fileserver is only slightly different from what we have
now, but it's a difference in **nature**, so it won't be easy to do
"right".  Right now, I'd have to restrict Bullet to easily decomposable
data problems, like software development and teaching (lots of little
files), CAD/CAM (moderately monstrous files), and, with a bit of
arm-waving, transaction processing.

In my next rant[tm], I'll touch on architectural support for VERY LARGE
FILES (:-)).

--dave
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
mash@mips.COM (John Mashey) (08/07/90)
In article <13667@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>...
>>This is very elegant, but there is
>>a problem.  We're running out of address bits again.
>...
>	I submit that your situation is something of an unusual case, and is
>likely to remain unusual for at least a decade, perhaps 2.  Few machines
>(percentage-wise) even have 4 GB of storage, let alone files larger than
>4GB (I've never even seen a file larger than 100MB, even on mainframes).
>
>	Eventually, perhaps, but not in the near future.  There are people
>who have greater needs, that's the whole justification for the selling of
>supercomputers, and the vastly expensive (read fast & large) IO systems
>that support them.  But they're a tiny minority, numbers-wise.  Until the
>number of people that require such things increases sufficiently, the only
>architectures to support the extra address bits will be the super- (and
>maybe mini-super-) computers.  Those extra address bits are _not_ free, in
>silicon, memory, etc.  (I hope we haven't started the 32+ addr bit rwars
>again...)

Well, there are always fewer higher-end things than lower-end ones.
However, I'd STRONGLY disagree with the idea that 64-bit machines will
remain confined to the super- & minisuper world for 10-20 more years.
I propose instead:

a) We are currently consuming address space at the rate of 1 bit per
year.

b) Plenty of applications already exist for workstation-class machines
for which the developers bitterly complain that they only have 31 or 32
bits of virtual address space, regardless of how much physical address
space they have.  Note that they want bigger physical memories also, of
course.  However, the real issue is being able to structure
applications conveniently, and then slide various amounts of real
memory underneath.  I've participated in customer meetings (commercial,
not even scientific) in which people complained seriously that some
microprocessor-based machine of ours started with 256 MB as maximum
memory.  They were happier to know we'd get 1 GB soon, but they still
grumbled that it should be higher....

c) Observe that there already exist desktop workstations that support
max physical memories in the 128 MB - 512 MB range, using 4 Mb DRAMs.
Hence, by the 64 Mb DRAM generation, one can expect 2 GB - 8 GB maxes.
After all, at that point, you can get 4 GB or so within a 1-ft cube.

Of course, such things will not be on every desktop.  However, people
will certainly expect the servers to be able to do such things, and
they'll certainly want workstations and servers that run the same code,
especially since the economics of this business mandate that a
company's smaller servers be derived from the workstations.

So, here's my counter-prediction to the idea that it will be 10-20
years.  No later than 1995:

1) There will be, in production, 64-bit microprocessors (and I mean
64-bit integers & pointers, not just 64-bit datapaths, which micros
have had for years in FP).  They'll cost < $500 apiece, i.e., less than
a 486 does today.  They'll either be new architectures, or derivations
of existing RISCs.

2) In fact, they'll be shipping in systems, in reasonable quantities.
Let me try a market-analyst prediction and claim that there will be at
least 50,000 such machines out there by YE1995, and 150,000 by YE1996.
Now, 150,000 machines is not a huge number... but it's rather larger
than the number of supers and minisupers....

So, here's a thought to stimulate discussion:
    What applications (outside the scientific / MCAD ones that
    can obviously consume the space) would benefit from 64-bit
    machines?
    Why?  (For example, here are some low-level reasons why a
    particular one might benefit.)
    a) Need more physical memory, and thus more virtual address space
       to deal with it conveniently.
    b) Need more virtual memory, to address a lot of data at once,
       and so probably need more physical memory also.
    c) Need more virtual memory, sometimes sparsely addressed, to use
       algorithms and design approaches that make the software
       reasonable, but possibly with less physical memory than b).
--
-john mashey    DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash
DDD:  408-524-7015, 524-8253 or (main number) 408-720-1700
USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/08/90)
mash@mips.COM (John Mashey) writes:
|So, here's a thought to stimulate discussion:
|    What applications (outside the scientific / MCAD ones that
|    can obviously consume the space) would benefit from 64-bit
|    machines?
|    Why?  (For example, here are some low-level reasons why a
|    particular one might benefit.)
|    a) Need more physical memory, and thus more virtual address space
|       to deal with it conveniently.
|    b) Need more virtual memory, to address a lot of data at once,
|       and so probably need more physical memory also.
|    c) Need more virtual memory, sometimes sparsely addressed, to use
|       algorithms and design approaches that make the software
|       reasonable, but possibly with less physical memory than b).

Well, transaction-processing machines with many large (effectively
sparse (:-)) databases to manipulate might well require fileservers
with large sparse address spaces, even if the machines doing the
computing didn't need all of the data at once.

Note that TP, like real-time, is an environment where the implausible
is done regularly, and the inelegant daily (:-)), all in the name of
performance.

--dave (rant[tm] coming: get your K key ready) c-b
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
pha@caen.engin.umich.edu (Paul H. Anderson) (08/08/90)
In article <13667@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes:
>
> (discussion about mapping 4 GB+ files in workstations deleted)
>
>	I submit that your situation is something of an unusual case, and is
>likely to remain unusual for at least a decade, perhaps 2.  Few machines
>(percentage-wise) even have 4 GB of storage, let alone files larger than
>4GB (I've never even seen a file larger than 100MB, even on mainframes).
>
>	Eventually, perhaps, but not in the near future.  There are people
>who have greater needs, that's the whole justification for the selling of
>supercomputers, and the vastly expensive (read fast & large) IO systems
>that support them.  But they're a tiny minority, numbers-wise.  Until the
>number of people that require such things increases sufficiently, the only
>architectures to support the extra address bits will be the super- (and
>maybe mini-super-) computers.  Those extra address bits are _not_ free, in
>silicon, memory, etc.  (I hope we haven't started the 32+ addr bit rwars
>again...)
>

In order to make computers very useful to social scientists, for
studies of econometric or population data, large data sets will be the
norm.  The Population Studies Center, for example, would like nothing
better than to quickly analyze 5-gigabyte datasets (hence my earlier
request for large-RAM systems).  Furthermore, many such datasets exist.
The 1990 census is just one 5-gigabyte file -- there are similar files
for the last 100 years or more.  Likewise for China, Russia, Europe,
and more.  Analyzing these things quickly is not currently very easy,
but that doesn't mean that people don't want to do it.

The demand is there now for computer systems that can deal with these
problems.  It may be some time before the demand and the cost of
meeting that demand meet, but make no mistake, the demand is there
right now!

Paul Anderson
University of Michigan
cliffc@sicilia.rice.edu (Cliff Click) (08/08/90)
In article <1990Aug7.190719.7907@caen.engin.umich.edu> pha@caen.engin.umich.edu (Paul H. Anderson) writes:
[ ...stuff about huge files... ]
Seems a step in the right direction would be to include transparent "bignums"
as a standard part of a programming language. Thus the applications
programmer writes his programs that don't care how big the file gets (the
system read/write/seek call must handle "bignums"). The smart compiler
can figure out when everything's going to remain as "small integers" and
skip the expensive run-time check code when it can be avoided.
How the OS maps the "bignum" to physical devices is a less difficult nut
to crack -- one can always shoot for an easy & slow solution (bank swapping,
paging, etc...).
The problems are 1) getting applications into a world/language which has
no size restrictions on integers, and 2) getting compilers which can prevent
the grotesque performance hits from generic "bignum" handling (a topic
which is somewhat close to my heart ;-).
Cliff Bignum Click
--
Cliff Click
cliffc@owlnet.rice.edu
khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) (08/08/90)
In article <1990Aug7.190719.7907@caen.engin.umich.edu> pha@caen.engin.umich.edu (Paul H. Anderson) writes:
>...
>The Population Studies Center, for example, would like nothing better
>than to quickly analyze 5 gigabyte datasets (hence my earlier request
>for large RAM systems).  Furthermore, many such datasets exist.  The
>1990 census is just one 5 gigabyte file - there are similar files for
>the last 100 years or more.  Likewise for China, Russia, Europe, and
>more.  Analyzing these things quickly is not currently very easy, but
>that doesn't mean that people don't want to do it.
>...
Humm.  In estimation problems there are lots of ways to skin cats.
Algorithms which have huge datasets, but "small" models, do not require
huge "core" storage.

In the satellite tracking biz, some experiments (like GPS baselines)
go on for years, and Tb of data could be necessary if one formed the
obvious normal-equations matrix

	 T
	A A

and proceeded to use elimination from there.

Back when I did that sort of work, we employed Square-Root Information
Filters, and/or UDU**T decomposition techniques.  If, for the sake of
argument, your model has 70 independent variables, the bulk of the
"core" needed is

	70*(70+1)/2 = 2485 words of storage

_independent_ of the size of the dataset.  Of course, one also gets
estimates in "real time" (viz. as fast as the data are available).
The "naive" approach would require that the entire dataset fit in
"core".

I am sure that there are many problems which require really huge
memories ... but I am certain that use of appropriate algorithms can
limit the number of such "hogs" considerably.

Those interested in SRIF and UD techniques might wish to peruse

	Factorization Methods for Discrete Sequential Estimation
	ISBN 0-12-097350-2
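The storage claim is easy to demonstrate with the naive recursive
cousin of these methods: accumulate the normal equations A^T A and
A^T b one observation at a time, so the core needed is fixed by the
model size, not the dataset size.  A sketch (a 2-parameter
straight-line fit via Cramer's rule -- an illustration of the
streaming idea, not a numerically careful SRIF/UDU**T implementation):

```c
/* Streaming least squares: consume the dataset one observation at a
 * time, accumulating A^T A and A^T b.  Storage is O(N^2) in the number
 * of model parameters and independent of the observation count.
 * Model here: y = c[0]*x + c[1]  (N = 2). */
#include <assert.h>
#include <math.h>

#define N 2

struct normal {
    double ata[N][N];   /* accumulates A^T A */
    double atb[N];      /* accumulates A^T b */
};

void add_obs(struct normal *s, double x, double y)
{
    double row[N] = { x, 1.0 };   /* one row of A */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++)
            s->ata[i][j] += row[i] * row[j];
        s->atb[i] += row[i] * y;
    }
}

/* Solve the 2x2 system for the estimates; returns 0 on success. */
int solve(const struct normal *s, double c[N])
{
    double det = s->ata[0][0] * s->ata[1][1]
               - s->ata[0][1] * s->ata[1][0];
    if (fabs(det) < 1e-12)
        return -1;
    c[0] = (s->atb[0] * s->ata[1][1] - s->ata[0][1] * s->atb[1]) / det;
    c[1] = (s->ata[0][0] * s->atb[1] - s->ata[1][0] * s->atb[0]) / det;
    return 0;
}
```

The point is add_obs: each observation is folded in and then can be
thrown away, which is what lets Tb of tape flow past a fixed few words
of core.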
--
----------------------------------------------------------------
Keith H. Bierman kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33 | (415 336 2648)
Mountain View, CA 94043
pha@caen.engin.umich.edu (Paul H. Anderson) (08/08/90)
In article <KHB.90Aug7132932@chiba.Eng.Sun.COM> khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) writes:
>
>Humm.  In estimation problems there are lots of ways to skin cats.
>Algorithms which have huge datasets, but "small" models, do not require
>huge "core" storage.
>
>In the satellite tracking biz, some experiments (like GPS baselines)
>go on for years, and Tb of data could be necessary if one formed the
>obvious
>	 T
>	A A
>
>and proceeded to use elimination from there.
>

This is very true, but researchers benefit enormously from interactive
computation, where the type of one computation may depend on the
outcome of the preceding ones.  Ideally, students in classes would be
able to investigate interactively, in a matter of minutes, problems
that it currently takes researchers many months to work through.

The thing that prevents this from taking place currently is that the
datasets are on magtapes, and therefore any computation using the
entire set of data is forced to sequentially access a very slow medium.
The current technique is for a researcher to identify the smallest
possible subset that has all the information they think they need.
This is coded up in a program that is run on a 3090 with lots of big,
fast, expensive tape drives.  Eventually the researcher gets the
information they asked for, and if they are lucky, it is actually what
they need.

The problem is that the process of exploration doesn't match this kind
of turnaround at all, so as a result, highly competent social
scientists sit around on their collective backsides, waiting for data
to show up on their desks.  Computers that can address this problem are
needed now, independently of whether or not they actually exist.  We
don't have a real problem optimizing use of the hardware we have now;
it is just that the available hardware has too little RAM, or has
filesystems that are too slow.

Paul Anderson
University of Michigan
davecb@yunexus.YorkU.CA (David Collier-Brown) (08/08/90)
cliffc@sicilia.rice.edu (Cliff Click) writes [about addressing large
files with bignums]:
| How the OS maps the "bignum" to physical devices is a less difficult nut
| to crack- one can always shoot for an easy & slow solution (bank
| swapping, paging, etc...).
| The problems are 1) getting applications into a world/language which has
| no size restrictions on integers, and 2) getting compilers which can
| prevent the grotesque performance hits from generic "bignum" handling
| (a topic which is somewhat close to my heart ;-).

Well, I wouldn't go so far as to remove range types from integers, but
I would propose a combined compiler -> OS -> hardware solution along
those general lines:

[Caution: architectural speculation from a non-architect (a
philosopher) follows: press "n" if not interested, "k" if you don't
like rants.]

Let us imagine
 0) a language with different-length integral types, at least one of
    which is longer than ``usual'', and specifically long enough to
    describe more memory than the biggest virtual address used by the
    target machine(s),
 1) a library declaring a large integral type we will call a vaddr_t,
    which maps to some language-supported first-class construct, if
    only
	struct vaddr_t { long foo[some number]; };
 2) an OS that uses vaddr_t's as the parameters to its
    a) file-positioning functions (like seek and tell),
    b) dereference operators, iff typing is preserved thereby,
 3) hardware that can support either one (initially) or more lengths
    of virtual address.

One can then write library functions which manipulate memory-mapped
files in a larger memory ``model'' than the hardware supports, and
later upgrade hardware, compilers and libraries as the supported types
get closer to what the applications writers are using.

Of course, our imaginary architect and his friend the compiler-writer
have just been handed an interesting task (:-)):
 2) making a non-standard-length pointer usable, generating good code
    to access it, dereference it and compare it
    (there are some classical tricks: bignum'ers probably can comment
    here),
 2.5) folding the overlength construct down into usable chunks (just
    as seeking and then referencing parts of the stdio buffer is done
    by, what else, stdio) in both the standard I/O and the
    memory-mapped I/O libraries, and
 3) giving the above adequate, elegant hardware support.

My personal speculation on (3) is that someone will provide an LAA and
SAA instruction pair in a RISC machine: it stands for Load Absurd
Address, and really means ``load less-significant part of bloody
oversize number into register, discarding the rest''.

--dave (this has been a rant[tm] by...) c-b
--
David Collier-Brown,  | davecb@Nexus.YorkU.CA, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave
Willowdale, Ontario,  | "And the next 8 man-months came up like
CANADA. 416-223-8968  |  thunder across the bay" --david kipling
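For concreteness, the vaddr_t sketch above might be filled in as a
two-word position with just compare, add, and the LAA-style fold-down
(all names invented; a real library would need much more):

```c
/* Filling in the vaddr_t sketch: a two-word "absurdly long" position
 * with just enough operations for a seek/tell library -- compare, add,
 * and the fold-down to what the hardware can actually use.  All names
 * are invented. */
#include <assert.h>

struct vaddr_t {
    unsigned long hi;    /* more-significant word */
    unsigned long lo;    /* less-significant word */
};

/* Three-way comparison: -1, 0, or 1. */
int va_cmp(struct vaddr_t a, struct vaddr_t b)
{
    if (a.hi != b.hi) return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo) return a.lo < b.lo ? -1 : 1;
    return 0;
}

/* a + n, carrying from the low word into the high word. */
struct vaddr_t va_add(struct vaddr_t a, unsigned long n)
{
    unsigned long lo = a.lo + n;
    if (lo < a.lo)        /* unsigned wraparound means a carry out */
        a.hi += 1;
    a.lo = lo;
    return a;
}

/* "Load Absurd Address": keep the less-significant part, discard the
 * rest. */
unsigned long va_fold(struct vaddr_t a)
{
    return a.lo;
}
```

The compiler-writer's job is to make this look like an ordinary
integer; until then, the library hides it behind seek and tell.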
rbw00@ccc.amdahl.com ( 213 Richard Wilmot) (08/08/90)
davecb@yunexus.YorkU.CA (David Collier-Brown) wrote:
> In article <30728@super.ORG> rminnich@super.UUCP (Ronald G Minnich)
> writes [in a discussion of Tanenbaum's Bullet fileserver]:
> | >This is very elegant, but there is
> | >a problem.  We're running out of address bits again.
> ... stuff deleted
> It's always a bad idea to put a hard addressing limit on things:
> Intel, based on their past needs, octupled the addressable memory
> available when they introduced the 8086, even though they expected 16k
> was adequate.  Experience has shown them wrong (;-)).  They needed to
> increase it by a somewhat larger factor [see, we're back to computer
> architecture again].
... more deleted.
>--dave

Indeed.  I began administering a database of 1,600 MB in 1974 for
Kaiser Medical of Northern California.  I think their membership has
grown from the 1,000,000 active and 2,000,000 inactive of that time,
and Kaiser very likely keeps much more data about each member.  This
was done with an IBM 370/158 computer having 1 MB of main memory.  All
of the performance parameters of that hardware system can be easily
exceeded by today's high-end PCs.  I am certainly willing to bet that
they've exceeded the 4,300 MB addressing limit of IBM's VSAM by now
(for them there are ways around this).  We always seem to exceed any
addressing limit we can imagine.

I still think it is time to stop trying to use integers for addressing.
They always break down and probably always will.  Many computers today
have floating-point units.  I would like to see floating point used for
addressing.  It would help a great deal if addresses were not dense.
What I have in mind is for data objects to have addresses, but these
would be floating point, and between any two objects we could *always*
fit another new object.  In this way I could expand a file from a
hundred bytes to a hundred gigabytes without changing the addresses of
any stored objects.
An index pointing to objects in such a file would never need to be adjusted as the file grew or shrank. Plastic addressing. This could still be efficient since addresses are almost never real anymore anyway. Addressing is used to locate things in high speed processor caches, in virtual memory, and on disk drives (and in disk controller caches). Integer addressing is unsuited to all these different tasks. Fractional addressing could be flexible enough to allow for all these locational needs. Some things are nicely stored by hashing instead of by b*tree organization (e.g. person records by driver license number): it minimizes update locking problems prevalent in b*trees as well as saving one or more extra levels of access. This is hard to do as a file grows but would be simple with a file addressed by fractions (0.5 is *logically* half way through the file). I think this was used by one of the graphics systems for describing picture objects (GKS?). So when will I see fractional addressing? -- Dick Wilmot | I declaim that Amdahl might disclaim any of my claims. (408) 746-6108
ccc_ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) (08/08/90)
In <10606@celit.fps.com>, dave@fps.com (Dave Smith) says "I thought the Bullet file server was neat, but...I showed the paper to one of our customers. He read through it and laughed. He said that he wanted to have files bigger than his main memory, that was the whole point of disks." Are you sure your customer isn't confusing the size of main *physical* memory with the size of *virtual* memory? Lawrence D'Oliveiro fone: +64-71-562-889 Computer Services Dept fax: +64-71-384-066 University of Waikato electric mail: ldo@waikato.ac.nz Hamilton, New Zealand 37^ 47' 26" S, 175^ 19' 7" E, GMT+12:00
cliffc@sicilia.rice.edu (Cliff Click) (08/08/90)
In article <dczq02zP01Hc01@JUTS.ccc.amdahl.com> rbw00@JUTS.ccc.amdahl.com ( 213 Richard Wilmot) writes: >davecb@yunexus.YorkU.CA (David Collier-Brown) wrote: > >> In article <30728@super.ORG> rminnich@super.UUCP (Ronald G Minnich) writes: >> [in a discussion of Tanenbaum's Bullet fileserver] >> | >This is very elegant, but there is >> | >a problem. We're running out of address bits again. >> > ... stuff deleted > > [ ...stuff about using fractional addressing deleted... ] > >So when will I see fractional addressing? Humm... infinite precision fractional addressing... isn't this equivalent to using "bignums"? Suppose I take a fractional addressing system, and generate a fraction digit-by-digit, 2 digits for each letter of a person's name. Suddenly I have a unique fraction which is some permutation of a name. In other words I have a system where names are addresses, and since names have no length limit, I have no limit to my file sizes. I say a rose by any other addressing mode is a rose; what you're describing is nothing more than a key-access file system. Cliff Infinite Fractions Click -- Cliff Click cliffc@owlnet.rice.edu
nvi@mace.cc.purdue.edu (Charles C. Allen) (08/08/90)
> I submit that your situation is something of an unusual case, and is > likely to remain unusual for at least a decade, perhaps 2. Few machines > (percentage-wise) even have 4 GB of storage, let alone files larger than 4GB > (I've never even seen a file larger than 100MB, even on mainframes). Until recently, the "standard" media for transporting files has been 9-track 6250 tape, which holds around 200M. Until recently, all our data files were less than 200M (hmm... I hope you see the correlation). Now that we have some 8mm tape drives, we routinely have 400M files. We'd have bigger ones, but all our disks are little SCSI 600-700M thingies (access time is not very critical), and we can't easily have a single file span volumes. This is for high energy physics data analysis. Charles Allen Internet: cca@newton.physics.purdue.edu Department of Physics nvi@mace.cc.purdue.edu Purdue University HEPnet: purdnu::allen, fnal::cca West Lafayette, IN 47907 talknet: 317/494-9776
dave@fps.com (Dave Smith) (08/08/90)
In article <1179.26bffdbf@waikato.ac.nz> ccc_ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) writes: >In <10606@celit.fps.com>, dave@fps.com (Dave Smith) says "I thought the >>Bullet file server was neat, but...I showed the paper to one of our >>customers. He read through it and laughed. He said that he wanted >>to have files bigger than his main memory, that was the whole point >>of disks." > >Are you sure your customer isn't confusing the size of main *physical* >memory with the size of *virtual* memory? Yes. We support (in the current product) up to 1GB of physical memory. Convex, Cray, etc. also support main memory sizes in that range. The customer in question does seismic processing and routinely has data sets in the 10GB range. They have a file server which pulls several smaller files together into a larger virtual file to get around our current limitations on maximum file size. Virtual memory for a Bullet-style server is kind of like using /dev/ram as your swap device. Why copy in from the disk just to copy it right back out again? -- David L. Smith FPS Computing, San Diego ucsd!celerity!dave or dave@fps.com
dave@fps.com (Dave Smith) (08/08/90)
In article <dczq02zP01Hc01@JUTS.ccc.amdahl.com> rbw00@JUTS.ccc.amdahl.com ( 213 Richard Wilmot) writes: >I still think it is time to stop trying to use integers for addressing. >They always break down and probably always will. Many computers today have >floating point units. I would like to see floating point used for addressing. >It would help a great deal if addresses were not dense. What I have in mind >is for data objects to have addresses but these would be floating point and >between any two objects we could *always* fit another new object. This won't work. There are only so many distinct numbers representable by floating point. There will be points where there is no "room" between two numbers because the granularity of the floating point doesn't allow a number to be represented between them. As a simple example with three-digit decimal floating point (three digits of mantissa, one digit of exponent), find the number between 1.01x10^1 and 1.02x10^1. With a 64-bit (combined mantissa and exponent) floating point number there are at most 2^64 distinct numbers that can be represented. The range is very large, but the numbers are sparse. I liked the idea of "tumblers" as put forth by the Xanadu project. Variable length indices, that's the only way to go. I think I'll go over and hit the hardware engineers over the head until they figure out a way to make it fast :-). -- David L. Smith FPS Computing, San Diego ucsd!celerity!dave or dave@fps.com
clj@ksr.com (Chris Jones) (08/08/90)
In article <dczq02zP01Hc01@JUTS.ccc.amdahl.com>, rbw00@ccc ( 213 Richard Wilmot) writes: >We always seem to exceed any addressing limit we can imagine. This is very true, at least so far. >I still think it is time to stop trying to use integers for addressing. >They always break down and probably always will. Many computers today have >floating point units. I would like to see floating point used for addressing. >It would help a great deal if addresses were not dense. What I have in mind >is for data objects to have addresses but these would be floating point and >between any two objects we could *always* fit another new object. In this way >I could expand a file from a hundred bytes to a hundred gigabytes without >changing the addresses of any stored objects. Um, I think that between any two *real* numbers you can always find another real number. Real numbers are a mathematical concept, and what is called floating point on computers merely implements a useful approximation of them. On computers with floating point, it is most definitely not the case that you can always fit another floating point number between two other such numbers. These things take up a finite number of bits, right, and that means there's a finite limit to their cardinality. -- Chris Jones clj@ksr.com {world,uunet,harvard}!ksr!clj
usenet@nlm.nih.gov (usenet news poster) (08/08/90)
In article <40644@mips.mips.COM> mash@mips.COM (John Mashey) writes: >However, I'd STRONGLY disagree with the idea that 64-bit machines will >remain confined to the super- & minisuper world for 10-20 more years. >I propose instead: > a) We are currently consuming address space at the rate of 1 bit/year. > b) Plenty of applications already exist for workstation-class machines, > for which the developers bitterly complain that they only have > 31 or 32 bits of virtual address space, ... > >So, here's my counter-prediction to the idea that it will be 10-20 years: >No later than 1995: > 1) There will be, in production, 64-bit microprocessors > (and I mean 64-bit integers & pointers, not just 64-bit > datapaths, which micros have had for years in FP). Maybe, but aside from address generation and floating point, what are people going to do with all those bits? Setting aside address arithmetic, most of the time you don't need 32 bit integers and lots of work involves bytes or smaller (character strings etc.). >[...] >So, here's a thought to stimulate discussion: > What applications (outside the scientific / MCAD ones that > can obviously consume the space) would benefit from 64-bit > machines? Using 64-bit chunks makes me wonder about non-numeric data representations. You can pack 8-12 characters of text in 64-bits, enough for most English words. Or how about small images (8x8 or 7x9 fonts etc.)? Pattern recognition using small neural nets operating on one or two registers of input data? An instruction set rich in bit manipulation could be a big help in exploiting these possibilities. >-john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> David States
peter@ficc.ferranti.com (Peter da Silva) (08/08/90)
In article <1990Aug8.010203.18560@rice.edu> cliffc@sicilia.rice.edu (Cliff Click) writes: > In article <dczq02zP01Hc01@JUTS.ccc.amdahl.com> rbw00@JUTS.ccc.amdahl.com ( 213 Richard Wilmot) writes: > >So when will I see fractional addressing? When someone invents a floating point unit that maps to the real number system. > Humm... infinite precision fractional addressing... isn't this equivalent to > using "bignums"? Or Ted Nelson's "Tumblers"? "Any problem in programming can be solved with another level of indirection" -- Peter da Silva. `-_-' +1 713 274 5180. 'U` <peter@ficc.ferranti.com>
wayne@dsndata.uucp (Wayne Schlitt) (08/08/90)
In article <40644@mips.mips.COM> mash@mips.COM (John Mashey) writes: > In article <13667@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes: > >... > >[ ... ] (I hope we haven't started the 32+ addr bit wars again...) > > Well, there are always less higher-end things than lower-end ones. > However, I'd STRONGLY disagree with the idea that 64-bit machines will > remain confined to the super- & minisuper world for 10-20 more years. ^^^^^ yes, 20 years is probably too long but i think that 10 years isn't too far off the mark. my guess is that it will be 7-10 years before 64bit computers are making inroads into the 32bit market. (maybe by then we will have finally gotten away from 16bit computers. 1/2 :-) > I propose instead: > a) We are currently consuming address space at the rate of 1 bit/year. i thought it was only 1 bit per 18 months... new data? > b) [says that there are commercial people who already want > 31-32 bits of virtual address space, and they want more.] and they are putting these applications on everyone's desk...? or are these applications things that they would have run on mainframes or super-mini's if your "killer-micro's" weren't chosen? > c) Observe that there already exist desktop workstations that > support max physical memories in the 128MB - 512MB range, > using 4Mb DRAMs. Hence, by the 64Mb DRAM generation, one can expect > 2GB - 8GB maxes. After all, at that point, you can get 4GB or > so within a 1-ft cube. using your own numbers, at one bit a year it will be 3-5 years before physical memory _maximums_ will reach 4GB. at one bit every 1.5 years, it will be more like 4.5-7.5 years. how long do you think it will be before 4GB is typical? also, when do you expect 64Mb DRAMs to come out? my guess would be around 5-6 years or so, and then it will take at least a year before they have ramped up production to the point that they are cheaper than 16Mb DRAMs. (as a reference point, would you consider 4Mb DRAMs "common" now? 
when do you think that 4Mb DRAMS became or will become "common"?) > [ .... ] -wayne
rminnich@super.ORG (Ronald G Minnich) (08/08/90)
In article <13667@cbmvax.commodore.com> jesup@cbmvax (Randell Jesup) writes: > I submit that your situation is something of an unusual case, and is >likely to remain unusual for at least a decade, perhaps 2. Few machines >(percentage-wise) even have 4 GB of storage, let alone files larger than 4GB >(I've never even seen a file larger than 100MB, even on mainframes). I couldn't disagree with you more. Now that storage is about $5K/gigabyte for Sun file servers (if you don't buy from Sun, that is) i would expect 4 Gb to be common. We have no Sun file servers here with < 4 Gb any more. I am using a demo Decstation 5000 right now with this little itty bitty box which contains a 1Gb disk. But that is a side issue. More important issue: suppose I find that there is a 6 Gb file at NCAR which shows a really neat ocean model. It is there, my workstation is here, so what do i do? Nowadays you do the easy thing: ftp it over the net. YYYYUUUUCCCCKKKK. No, wait, i forgot: buy plane tickets to Colorado. Now that is fun, but you have just left your entire environment behind in (my case) Bowie, Md. That is no good either: now i have to ftp my environment to Colorado! What I *want* to do is say: "when this program runs, please associate this 6Gb chunk of its address space with that file over there on NCAR". Problem solved. Only I don't have any architectures that will let me, because of architectural limitations. Well, maybe PA can do it, with its ability to address billions and billions of segments each of which can contain billions and billions of bytes. There are good reasons to have more than 32 bits *now*. ron -- 1987: We set standards, not Them. Your standard windowing system is NeUWS. 1989: We set standards, not Them. You can have X, but the UI is OpenLock. 1990: Why are you buying all those workstations from Them running Motif?
tom@nw.stl.stc.co.uk (Tom Thomson) (08/08/90)
In article <40644@mips.mips.COM> mash@mips.COM (John Mashey) writes: > using 4Mb DRAMs. Hence, by the 64Mb DRAM generation, one can expect > 2GB - 8GB maxes. After all, at that point, you can get 4GB or > so within a 1-ft cube. I sure do hope that physical dimension is wrong! If I can't do it in less than half a cubic foot by 1996 I've got problems. Of course, I don't believe I have any problem at all here. Tom [tom@nw.stl.stc.co.uk
aglew@dual.crhc.uiuc.edu (Andy Glew) (08/08/90)
..> Who can use 64 bit machines? As David States points out (and several comp.arch readers who have worked on machines with large registers have testified in the past) 64 bit registers are very nice for manipulating strings in, since many strings are shorter than 8 characters. Store partial operations (with or without an implied shift) are useful. COBOL like fixed field width operations are particularly well suited to large register widths, although null terminated C-style strings can easily be handled by a number of operations that have been added to RISCs like HP's. By the way, can anyone recall the name of the PDP-11 retrospective that advocated separate address and data registers? If I remember correctly, they figured that 32 bit addresses were needed, but 16 bit integers were enough for most people's needs, so why provide things like a 32x32 multiply? (I may have misremembered, and they may have advocated 16 bit addresses but 32 bit integers, but I don't think so). I will admit that this paper influenced me for quite a while (in fact, still does, in a mixed sense); I do not know for sure, but I would also reckon that it influenced the design of two popular microprocessor families. Any bets on whether people will take the same limiting strategy (I won't call it a mistake, because it might be right for short-term goals) by providing 32 bit data registers and 64 bit address registers? -- Andy Glew, andy-glew@uiuc.edu Propaganda: UIUC runs the "ph" nameserver in conjunction with email. You can reach me at many reasonable combinations of my name and nicknames, including: andrew-forsyth-glew@uiuc.edu andy-glew@uiuc.edu sticky-glue@uiuc.edu and a few others. "ph" is a very nice thing which more USEnet sites should use. UIUC has ph wired into email and whois (-h garcon.cso.uiuc.edu). The nameserver and full documentation are available for anonymous ftp from uxc.cso.uiuc.edu, in the net/qi subdirectory.
rminnich@super.ORG (Ronald G Minnich) (08/09/90)
In article <1990Aug7.205747.14206@caen.engin.umich.edu> pha@caen.engin.umich.edu (Paul H. Anderson) writes: >We don't have a real problem optimizing use of the hardware we >have now, it is just that the available hardware has too little >RAM, or has filesystems that are too slow. And once it gets enough ram we will run out of address bits again! this ds 5000 on my desk has 128 mb of memory, and can have 512 mb. That is getting uncomfortably close to running out of address bits. I figure we will be there in two years, at a bit a year. I guess there was a 32-bit war here before, judging by earlier comments, but fact is we are about to run out. ron -- 1987: We set standards, not Them. Your standard windowing system is NeUWS. 1989: We set standards, not Them. You can have X, but the UI is OpenLock. 1990: Why are you buying all those workstations from Them running Motif?
seibel@cgl.ucsf.edu (George Seibel) (08/09/90)
In article <5286@mace.cc.purdue.edu> nvi@mace.cc.purdue.edu (Charles C. Allen) writes: >> I submit that your situation is something of an unusual case, and is >> likely to remain unusual for at least a decade, perhaps 2. Few machines >> (percentage-wise) even have 4 GB of storage, let alone files larger than 4GB >> (I've never even seen a file larger than 100MB, even on mainframes). > >Until recently, the "standard" media for transporting files has been >9-track 6250 tape, which holds around 200M. Until recently, all our >data files were less than 200M (hmm... I hope you see the >correlation). Now that we have some 8mm tape drives, we routinely >have 400M files. We'd have bigger ones, but all our disks are little >SCSI 600-700M thingies (access time is not very critical), and we >can't easily have a single file span volumes. This is for high energy >physics data analysis. The important question here is: "what are these large files worth to you?" It sounds as though you've always had datasets larger than the limits imposed on you by hardware/software, and that you likely got by in the past (and present) by splitting data into multiple files. I generate a lot of data from MD simulations, but find that it's more convenient to split it into manageable chunks that are far smaller than 4GB. The size of "manageable" is of course determined by a variety of hardware/ software performance/capacity issues, plus economics and politics. At any rate, I've been splitting data files up for years, and I bet everyone else has been as well. I already have the software in place to deal with multiple files, and don't expect that the ability to have a gigantic single file will make a vast improvement in my life. I'm sure that someone out there needs huge files, but I also suspect there is a price to be paid for going to the next higher increment of address size. 
I would rather not pay that price until the performance level of network, cpu, memory, mass storage, etc has come to such a level that my "manageable" chunks of data are approaching the GB range. I guess it's up to market analysis to decide when "enough" people have reached the point where the benefits of a larger address space are worth the cost. This will of course depend on the good work of you designers and engineers. It's a balancing act. George Seibel, UCSF seibel@cgl.ucsf.edu
jonah@dgp.toronto.edu (Jeff Lee) (08/09/90)
rminnich@super.ORG (Ronald G Minnich) writes: > [...] More important issue: suppose I find that there >is a 6 Gb file at NCAR which shows a really neat ocean model. It is there, >my workstation is here, so what do i do? Nowadays you do the easy thing: >ftp it over the net. YYYYUUUUCCCCKKKK. No, wait, i forgot: buy plane >tickets to Colorado. Now that is fun, but you have just left your >entire environment behind in (my case) Bowie, Md. That is no good either: >now i have to ftp my environment to Colorado! >What I *want* to do is say: "when this >program runs, please associate this 6Gb chunk of its address space with >that file over there on NCAR". Problem solved. [...] Given the current data+program+interface modularization, there are at least four options: 1) hire a station-wagon full of mag-tapes (or send a DAT by over-night courier) [6GB would saturate a T1 line (1.5Mbit/sec) for 9.1 hours.] 2) split between the data and program (e.g. with distributed shared memory) 3) split between the program and interface (e.g. with Plan 9, X, or NeWS) 4) plan a quick holiday in Colorado (with Plan 9, your environment follows you automatically) It depends on where the data-flow volumes are and what are the cost breaks. If you plan to use all of the data more than once, grabbing your own copy is not a bad idea. If you are going to analyse some or all of the data set and plot summary results, do it remotely and ship the plot back (batch or real-time). Only if you are planning to randomly access a *small* portion of this database does mapping all 6GB into your address space make sense. [And if you are randomly accessing a small part, then a remote file system might work almost as well as mapping it into memory.] I will agree though that most present operating systems will choke on the idea of a single 6GB random-access file -- or a 6GB virtual memory image. j.
mash@mips.COM (John Mashey) (08/09/90)
In article <3293@stl.stc.co.uk> "Tom Thomson" <tom@stl.stc.co.uk> writes: >In article <40644@mips.mips.COM> mash@mips.COM (John Mashey) writes: >> using 4Mb DRAMs. Hence, by the 64Mb DRAM generation, one can expect >> 2GB - 8GB maxes. After all, at that point, you can get 4GB or >> so within a 1-ft cube. >I sure do hope that physical dimension is wrong! If I can't do it in less >than half a cubic foot by 1996 I've got problems. Of course, I don't believe >I have any problem at all here. Of course. The comment wasn't intended to be a close-order estimate of the space, merely to note that it would be easy to get a lot of memory in a small box.:-) Note that if you use 64-bit wide memories (+ byte parity, to end up using 72 bits wide), and if you assume 64Mb-by-1 (which may or may not be best assumption), then the "natural" memory increment is 512MB (+ parity bits), using 72 DRAMs (8 SIMMs). Now, a MIPS Magnum, a desktop workstation, has 32 SIMM slots (to get 32MB with 1Mb, 128MB with 4Mb DRAMs), arranged in two rows of 16. Looks like the 1996 version could have 4X512MB = 2GB in that same space, and in fact, one would really need to get up around 8GB- 16GB to get close to a cubic foot. of course, you'd certainly want ECC memory instead, with such sizes, so some additional space would get chewed up. Although the rate of improvement in DRAM cost/bit seems to be slowing a bit, it's still OK. Of course, even if all of this is off a year or two, it still means that one will fairly soon (1995 is about as far away in one direction as the early commercial RISCs were in the other....) be able to easily build desktop/deskside computers whose memories are in current-supercomputer-or-bigger ranges.... Exercise 1: using the chart on page 55 of Patterson&Hennessy, predict the cost of the memory in the 512MB "entry system" described above (assuming it was parity, for simplicity), using 64Mb DRAMs. Hint: the cost certainly depends on the date! 
-- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash DDD: 408-524-7015, 524-8253 or (main number) 408-720-1700 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
mash@mips.COM (John Mashey) (08/09/90)
In article <WAYNE.90Aug8085346@dsndata.uucp> wayne@dsndata.uucp (Wayne Schlitt) writes: >> Well, there are always less higher-end things than lower-end ones. >> However, I'd STRONGLY disagree with the idea that 64-bit machines will >> remain confined to the super- & minisuper world for 10-20 more years. ^^^^^ >yes, 20 years is probably too long but i think that 10 years isn't too >far off the mark. my guess is that it will be 7-10 years before 64bit >computers are making inroads into the 32bit market. (maybe by then we >will have finally gotten away from 16bit computers. 1/2 :-) Well, 7 isn't too far from 5. >> I propose instead: >> a) We are currently consuming address space at the rate of 1 bit/year. >i thought it was only 1 bit per 18 months... new data? Rough rule in any case. Hennessy&Patterson claim (page 16): "This translates to a consumption of address bits at a rate of 1/2 bit to 1 bit per year." Hennessy always says informally that the old rule of 1/2 bit per year has tended to shift more towards 1 bit per year with MOS memories. In any case, this is a vague enough metric that these are in the same ballpark. :-) > >> b) [says that there are commercial people who already want >> 31-32 bits of virtual address space, and they want more.] >and they are putting these applications on everyone's desk...? or are >these applications things that they would have run on mainframes or >super-mini's if your "killer-micro's" weren't chosen? Well, the ones I've heard of weren't for everybody's desk, but they would have liked them to be on some people's desks, and were running on architectures suitable for the desktop. > >> c) Observe that there already exist desktop workstations that >> support max physical memories in the 128MB - 512MB range, >> using 4Mb DRAMs. Hence, by the 64Mb DRAM generation, one can expect >> 2GB - 8GB maxes. After all, at that point, you can get 4GB or >> so within a 1-ft cube. 
>using your own numbers, at one bit a year it will be 3-5 years before >physical memory _maximums_ will reach 4GB. at one bit every 1.5 >years, it will be more like 4.5-7.5 years. how long do you think it >will be before 4GB is typical? Well, I don't think 4GB/desktop will be "typical" for a long time, if ever. I do think that there will be reasonable numbers of machines whose maximum memories are in this range, for a whole bunch of the typical reasons that cause systems to be built in certain ways. a) Note that the "bit/year" rule is really applicable to virtual memory, not necessarily physical memory, although the latter certainly correlates with the former. (Old saw: virtual memory is a way of selling more physical memory.) b) Anyone building a system will typically design it for at least 2 generations of DRAMs. Right now, at least DEC, MIPS, and Sun build desktops that use either 1Mb or 4Mb chips to cover various ranges. I'm sure most everybody else does, also. > >also, when do you expect 64Mb DRAMs to come out? my guess would be >around 5-6 years or so, and then it will take at least a year before >they have ramped up production to the point that they are cheaper than >16Mb DRAMs. (as a reference point, would you consider 4Mb DRAMs >"common" now? when do you think that 4Mb DRAMs became or will become >"common"?) I'd consider 4Mb DRAMs "common" (not "prevalent"): multiple vendors have been delivering them in desktop systems already. (Some of the very first MIPS Magnums that got sold had 128MB maxed-out memories in them :-) Suppose you get 16Mb chips in the same state in 1993-1994, and 64Mb chips in 1996-1997, or 1995 if you're really lucky. Certainly, people who design systems tend to allow for at least 2 DRAM sizes in boards, so things designed 1 year before 64Mb chips become practical will allow for them. 
All of this says that a Magnum-like design appearing in 1995 would likely come out the door to use 16Mb chips, which would give a max memory of 512MB, and then upgrade to 2GB max. A DECstation5100-like design has space for 4X more memory. Again, I make no claims that such would be "typical" (whatever that is). However, people like to buy systems whose max memories are bigger than typical to leave them room for growth. Application areas that will tend to want this stuff quickly include: ECAD, MCAD, Image applications, geographic information systems, financial modeling, as well as databases. Observe that large memories are one of the few obvious helps for DBMS read-performance assistance, so you'll see it in the commercial world, as well. (back to virtual-address space) Finally, all of the economics of the business say that people like to have ranges of machines that can run the same software. It may well be that you have servers that have massive amounts of memory (sorry, lots of people WILL have servers with massive amounts of memory), and smaller desktop/desksides, but you'd certainly like to run the same applications on both at least some of the time, even if you back up the desktop with less physical memory. Again that's why I'd claim that 1995-micros will want to either be 64-bit ones, or at least have 64-bit modes. Note that with the number of transistors likely to be available, you can probably stuff a 32-bit CPU in the corner of your 64-bit one to handle backward compatibility. -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: mash@mips.com OR {ames,decwrl,prls,pyramid}!mips!mash DDD: 408-524-7015, 524-8253 or (main number) 408-720-1700 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/09/90)
In article <1990Aug8.042631.7093@nlm.nih.gov>, usenet@nlm.nih.gov (usenet news poster) writes: > Setting aside address arithmetic, > most of the time you don't need 32 bit integers and lots of work involves > bytes or smaller (character strings etc.). For character strings, note that ANSI C's wchar_t type is already bigger than a byte, and that this is driven by user demand (for Kanji), and that the ISO 10646 standard which is currently under development encodes characters in *32* bits. (I do hope that ISO 10646 will include the cuneiform characters, they've room for them, after all...) Given that COBOL requires support for a minimum of 18 decimal digits, 64 bits sounds just about right for "ordinary" programs. For several years I had the pleasure of using a machine with 40 bit integers (39 bits of magnitude, 1 bit of sign, in a 48-bit word) and somehow it was never _quite_ enough. Give people 64 bits and they'll use them, don't you worry about that. -- Taphonomy begins at death.
rminnich@super.ORG (Ronald G Minnich) (08/09/90)
In article <1990Aug8.195229.23544@jarvis.csri.toronto.edu> jonah@dgp.toronto.edu (Jeff Lee) writes:
>1) hire a station-wagon full of mag-tapes (or send a DAT by over-night
>courier) [6GB would saturate a T1 line (1.5Mbit/sec) for 9.1 hours.]
T1 is now slow. 6 GB would saturate an NREN for about 50 seconds. If you think I am saying that mapping the 6GB NCAR file in is equivalent to copying the whole thing over when your program starts up, I am not. That is just another form of FTP.
>2) split between the data and program (e.g. with distributed shared memory)
Well, I am partial to that solution, not least because hardware implementations of such things have been done at least once, and I can't see using read and write calls to drive an NREN.
>4) plan a quick holiday in Colorado (with Plan 9, your environment
>follows you automatically)
How many megabytes have to follow me? Sounds like the same problem to me.
>Only if you are planning to randomly access a *small* portion of this
>database does mapping all 6GB into your address space make sense.
That depends to some extent on what you think your network is. For T1 I would agree with you. On a hypothetical 1Gb network I am no longer so sure.
>I will agree though that most present operating systems will choke on
>the idea of a single 6GB random-access file -- or a 6GB virtual memory
>image.
More important, few architectures can accommodate it well. For instance, even 8Kb pages are silly in this case, but they are too big for most other cases. Does the small-page/large-page split deserve another look? I have seen one VM architecture recently in which the idea of paging is abandoned completely, because it makes no sense in a large-memory environment. Note that VM was NOT abandoned on this machine.
ron
-- 
1987: We set standards, not Them. Your standard windowing system is NeUWS.
1989: We set standards, not Them. You can have X, but the UI is OpenLock.
1990: Why are you buying all those workstations from Them running Motif?
djh@osc.edu (David Heisterberg) (08/09/90)
In article <13667@cbmvax.commodore.com>, jesup@cbmvax.commodore.com (Randell Jesup) writes:
> I submit that your situation is something of an unusual case, and is
> likely to remain unusual for at least a decade, perhaps 2. Few machines
> (percentage-wise) even have 4 GB of storage, let alone files larger than 4GB
> (I've never even seen a file larger than 100MB, even on mainframes).
Hang around with some quantum chemists sometime. Files larger than 1GB are routine. In recent years, so-called direct SCF methods have become popular (again?) because the study of large molecules results in enormous files for the two-electron integrals. The direct methods simply recalculate the integrals whenever needed. This is OK for simple SCF calculations, and Gaussian 90 will have direct MP2, but direct CI and CC are going to be tough. Realistic calculations could benefit from address spaces (and real memory) of 16 GB or more; there are folks who could use that capability right now.
-- 
David J. Heisterberg djh@osc.edu And you all know
The Ohio Supercomputer Center djh@ohstpy.bitnet security Is mortals'
Columbus, Ohio 43212 ohstpy::djh chiefest enemy.
kitchel@iuvax.cs.indiana.edu (Sid Kitchel) (08/09/90)
In article <5286@mace.cc.purdue.edu> nvi@mace.cc.purdue.edu (Charles C. Allen) writes:
|> I submit that your situation is something of an unusual case, and is
|> likely to remain unusual for at least a decade, perhaps 2. Few machines
|> (percentage-wise) even have 4 GB of storage, let alone files larger than 4GB
|> (I've never even seen a file larger than 100MB, even on mainframes).
|
|Until recently, the "standard" medium for transporting files has been
|9-track 6250 tape, which holds around 200M. Until recently, all our
|data files were less than 200M (hmm... I hope you see the correlation).
|Now that we have some 8mm tape drives, we routinely have 400M files.
|We'd have bigger ones, but all our disks are little SCSI 600-700M
|thingies (access time is not very critical), and we can't easily have
|a single file span volumes. This is for high energy physics data analysis.
Ah, the joy of isolation at Purdue!! Here at Indiana University we have something called Sociology. Some sociologists have developed the nasty habit of investigating the U.S. Census data. Currently I'm working with a group studying a fairly restricted set of county data from the 1970 Census that is 6 tapes long. These are 9-track 6250 bpi tapes. The Census Bureau makes standard extracts each census year that are often 20 to 30 tapes long. Yes, Virginia, big files do exist! And our best guess is that some of them will only get bigger. Now if VMS only had tape handling...
--Sid
-- 
Sid Kitchel...............WARNING: allergic to smileys and hearts....
Computer Science Dept. kitchel@cs.indiana.edu
Indiana University kitchel@iubacs.BITNET
Bloomington, Indiana 47405-4101........................(812)855-9226
jr@oglvee.UUCP (Jim Rosenberg) (08/11/90)
In <dczq02zP01Hc01@JUTS.ccc.amdahl.com> rbw00@ccc.amdahl.com ( 213 Richard Wilmot) writes: >I still think it is time to stop trying to use integers for addressing. >They always break down and probably always will. Many computers today have >floating point units. I would like to see floating point used for addressing. Excuse me? You're proposing that addressing be based on *INEXACT* arithmetic?? Sure sounds like a can of worms to me! In scientific programming one has to be careful to test floating point numbers for difference within some epsilon rather than for absolute equality. Not being able to test addresses for exact equality seems like a fatal weakness, IMHO. You could get around this problem by just using the fraction and exponent as "keys" to the address, stripped of their floating point semantic content. But all this does is give you an integer with as many bits as the fraction + the exponent could be. To be assured you could always interpose something between two addressed entities you *need* floating point semantics. And in fact you obviously want them. How are you proposing this would work??? (Alas, this discussion may not belong in comp.databases ...) -- Jim Rosenberg #include <disclaimer.h> --cgh!amanue!oglvee!jr Oglevee Computer Systems / / 151 Oglevee Lane, Connellsville, PA 15425 pitt! ditka! INTERNET: cgh!amanue!oglvee!jr@dsi.com / /
jkrueger@alxfac.UUCP (Jon Krueger) (08/11/90)
dave@fps.com (Dave Smith) writes:
>In article <dczq02zP01Hc01@JUTS.ccc.amdahl.com> rbw00@JUTS.ccc.amdahl.com ( 213 Richard Wilmot) writes:
>>I still think it is time to stop trying to use integers for addressing.
>>They always break down and probably always will. Many computers today have
>>floating point units. I would like to see floating point used for addressing.
>This won't work. There are only so many distinct numbers representable by
>floating point.
Richard implied the usual exponent-and-mantissa representation of floats. One might use two bignums instead. Their ratio represents all rational numbers exactly: arbitrary precision, no overflow, no underflow, no loss of precision. High cost? TANSTAAFL. Consider associative arrays.
-- Jon
ge@phoibos.cs.kun.nl (Ge Weijers) (08/13/90)
davecb@yunexus.YorkU.CA (David Collier-Brown) writes: > I confess I'd have **real** trouble selling a raw bullet file system >to a customer doing anything but cad/cam, software development[2] or small >databases. In the context of Amoeba no limit is posed on the number of differently implemented file systems. Bullet is certainly not the solution to all storage problems. It is supposed to support a limited set of operations (e.g. loading code images from disk) very quickly. So you'd sell him a DB file system AND Bullet for the 'normal' files. Directories are stored elsewhere, so the DB files and the other files can still be stored in the same directory. Ge' Ge' Weijers Internet/UUCP: ge@cs.kun.nl Faculty of Mathematics and Computer Science, (uunet.uu.net!cs.kun.nl!ge) University of Nijmegen, Toernooiveld 1 tel. +3180612483 (UTC+1, 6525 ED Nijmegen, the Netherlands UTC+2 march/september
ge@phoibos.cs.kun.nl (Ge Weijers) (08/13/90)
rminnich@super.ORG (Ronald G Minnich) writes:
]And once it gets enough RAM we will run out of address bits again! This
]DS 5000 on my desk has 128 MB of memory, and can have 512 MB. That is getting
]uncomfortably close to running out of address bits. I figure we will be there
]in two years, at a bit a year. I guess there was a 32-bit war here before,
]judging by earlier comments, but the fact is we are about to run out.
I propose using 256 bits. With one atom/bit storage the universe
will not support more than a few thousand PCs. Of course there should
be an option to use segments (max 2^256 for uniformity's sake).
Ge'
Ge' Weijers Internet/UUCP: ge@cs.kun.nl (uunet.uu.net!cs.kun.nl!ge)
Faculty of Mathematics and Computer Science, University of Nijmegen,
Toernooiveld 1, 6525 ED Nijmegen, the Netherlands
tel. +3180612483 (UTC+1, UTC+2 March/September)