peralta@pinocchio.Encore.COM (Rick Peralta) (02/07/90)
What are the feelings here regarding 64 bit longs?  There are applications
and devices that are breaking the Gigabyte limits of the 32 bit architecture,
and it seems that we will be stuck in a too-small address space again.

What platforms are using the 64 bit ALUs?  Is anyone using 64 bit pointers
(yet)?  Does anyone have arguments (applications) for using 64 bits?

A few that come to mind are:

 . just plain math resolution
 . very large virtual memory
 . larger disk storage
   (no joke single volumes will be breaking lseek() soon)

 - Rick
truesdel@sun217..nas.nasa.gov (David A. Truesdell) (02/07/90)
peralta@pinocchio.Encore.COM (Rick Peralta) writes:
>Does anyone have arguments (applications) for using 64 bits?
>
>A few that come to mind are:
>
> . larger disk storage
>   (no joke single volumes will be breaking lseek() soon)

lseek is already "broken" here.  I'm in the process of testing a striped
filesystem which currently weighs in at 20 GigaBytes, with a production size
expected to be 200+ GB.  Our fsck needs a special "long lseek" (64 bits) to
move around.  Also, I believe the Cray allows files of more than 2 GB; it
already has 64 bit longs.

T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
"Testing can show the presence of bugs, but not their absence." -- Dijkstra
"Each new user of a new system uncovers a new class of bugs." -- Kernighan
peralta@pinocchio.Encore.COM (Rick Peralta) (02/08/90)
In article <4812@amelia.nas.nasa.gov> (David A. Truesdell) writes:
>peralta@pinocchio.Encore.COM (Rick Peralta) writes:
>
>>Does anyone have arguments (applications) for using 64 bits?
>>
>>A few that come to mind are:
>>
>> . larger disk storage
>>   (no joke single volumes will be breaking lseek() soon)
>
>lseek is already "broken" here.  I'm in the process of testing a striped
>filesystem which currently weighs in at 20 GigaBytes, with a production size
>expected to be 200+ GB.

Have you standardized your new seek?  I was playing with the idea of
implementing bseek(fd, 64bits, whence).  It seemed kind of nasty to start
with, but got quite reasonable quickly.  Inside the kernel the address can
easily be broken up into block-size (1K-8K) chunks, the indexing into the
device done, and then the remainder of the address used to set up the
details of the position.

This seemed like a nice perk for backward compatibility too.  Just have a
union for each block size and juggle away.  The application could play the
same game and not be constrained by the actual block size.

Of course, things like ftell would break.  Maybe the address of the off_t
could be passed into bseek and the new offset returned in the variable
(gack! I think I've been looking at too much AT&T source).  The return value
would then be status information.

 - Rick   (Or maybe you are a floating point fan... 8^)
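A user-level sketch of the offset-splitting arithmetic a bseek() along the
lines described above might use internally.  The struct, the 4K block size,
and the function names here are all hypothetical, and a real implementation
would of course live in the kernel:

    /* A 64-bit byte offset carried as two 32-bit halves, for compilers
     * without a 64-bit integer type.  Both the struct and the bseek()
     * it illustrates are hypothetical. */
    struct off64 {
        unsigned long hi;       /* upper 32 bits of the byte offset */
        unsigned long lo;       /* lower 32 bits of the byte offset */
    };

    #define BLKSIZE 4096UL      /* assumed filesystem block size (1K-8K range) */

    /* Split a 64-bit byte offset into a block index plus a remainder,
     * the way a hypothetical bseek(fd, off, whence) might do internally.
     * Assumes offsets stay below 2^44 bytes so the block index fits in
     * an unsigned long; since 2^32 / 4096 = 2^20, the high word
     * contributes hi * 2^20 whole blocks. */
    void split_offset(struct off64 off, unsigned long *block, unsigned long *rem)
    {
        *block = (off.hi << 20) + off.lo / BLKSIZE;
        *rem   = off.lo % BLKSIZE;
    }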
gwyn@smoke.BRL.MIL (Doug Gwyn) (02/08/90)
In article <11071@encore.Encore.COM> peralta@pinocchio.UUCP (Rick Peralta) writes:
>What are the feelings here regarding 64 bit longs?

This is really a C question, not a UNIX question.  64-bit long integers are
just fine.  In fact, in the kind of environment you describe I'd even say
they are preferred.  However, you may find that many applications "know"
that longs are 32 bits.  Such applications are already broken, but market
pressure may cause you to cater to them anyway.
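An illustrative (made-up) example of the kind of breakage Doug describes:
code that hard-wires the size of a long, next to code that lets the
implementation decide.  The file name is arbitrary:

    #include <stdio.h>

    int main(void)
    {
        long counter = 123456789L;
        FILE *fp = fopen("counter.dat", "w");   /* hypothetical output file */

        if (fp == NULL)
            return 1;

        /* Broken: hard-wires the idea that a long occupies 4 bytes; on a
         * machine with 64-bit longs this writes only part of the object
         * (which part depends on byte order). */
        fwrite(&counter, 4, 1, fp);

        /* Portable: writes however many bytes a long actually occupies
         * on this implementation. */
        fwrite(&counter, sizeof(long), 1, fp);

        fclose(fp);
        return 0;
    }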
scott@bbxsda.UUCP (Scott Amspoker) (02/09/90)
In article <4812@amelia.nas.nasa.gov> (David A. Truesdell) writes:
>lseek is already "broken" here.  I'm in the process of testing a striped
>filesystem which currently weighs in at 20 GigaBytes, with a production size
>expected to be 200+ GB.

Forgive my ignorance, but what is a "striped" filesystem?
--
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232
unmvax.cs.unm.edu!bbx!bbxsda!scott
markh@attctc.Dallas.TX.US (Mark Harrison) (02/09/90)
writes:
>What are the feelings here regarding 64 bit longs?

As Unix tries to get a larger share of the commercial market, we will see
a need for storing numeric values with 18-digit precision, a la COBOL and
the IBM mainframe.  This can be accomplished in 64 bits, and is probably
the reason "they" chose 18 digits as their maximum precision.

btw, I have always heard 64 bit integers referred to as "xlongs" (extra
longs)... is this common, or just our own local jargon?

Mark Harrison
(markh @ attctc)
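To put numbers on the 18-digit claim: a signed 64-bit integer runs up to
2^63 - 1 = 9,223,372,036,854,775,807, a 19-digit value, so every 18-digit
decimal quantity fits with room to spare, while a signed 32-bit integer
tops out at 2^31 - 1 = 2,147,483,647, fewer than 10 full digits.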
truesdel@sun217..nas.nasa.gov (David A. Truesdell) (02/10/90)
peralta@pinocchio.Encore.COM (Rick Peralta) writes:
>Have you standardized your new seek?

It's not exactly a new seek call; it's actually implemented as an ioctl()
for the raw device.  Amdahl's UTS supports a 64-bit "long long", so games
don't have to be played with arrays of smaller integers, as is done with
select().

Now, if only people didn't write code which assumes a long is the same as
an int, we could change the compiler to think longs were always 64 bits,
and a lot of the current limits would simply vanish.

T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
"Testing can show the presence of bugs, but not their absence." -- Dijkstra
"Each new user of a new system uncovers a new class of bugs." -- Kernighan
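A rough sketch of how a 64-bit seek might be pushed through an ioctl() on a
compiler with "long long".  The request code and argument layout below are
invented for illustration, not UTS's actual interface; only ioctl() itself
is a real system call:

    #include <sys/ioctl.h>

    #define LLSEEK64  0x4c53       /* made-up ioctl request code */

    struct llseek_arg {
        long long offset;          /* 64-bit byte offset into the device */
        int       whence;          /* 0 = absolute, 1 = relative, as with lseek */
    };

    /* Seek the raw device to a 64-bit offset; the (hypothetical) driver
     * is assumed to return the resulting position in arg.offset. */
    long long long_seek(int fd, long long offset, int whence)
    {
        struct llseek_arg arg;

        arg.offset = offset;
        arg.whence = whence;
        if (ioctl(fd, LLSEEK64, &arg) < 0)
            return (long long) -1;
        return arg.offset;
    }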
truesdel@sun217..nas.nasa.gov (David A. Truesdell) (02/10/90)
scott@bbxsda.UUCP (Scott Amspoker) writes:
>In article <4812@amelia.nas.nasa.gov> (David A. Truesdell) writes:
>>lseek is already "broken" here.  I'm in the process of testing a striped
>>filesystem which currently weighs in at 20 GigaBytes, with a production size
>>expected to be 200+ GB.

>Forgive my ignorance, but what is a "striped" filesystem?

A striped (or striping) filesystem is one in which the filesystem is spread
out over a set of disks in order to increase capacity and/or performance
and/or reliability.  The filesystem I'm testing would be classed as "level 5
RAID".  (That's "Redundant Array of Inexpensive Disks"; too bad our disks
can't really be called "Inexpensive".)

You can check out the September '89 (v7i9) issue of UNIX Review, which has
an article ("Winged Memory") covering the ideas behind RAID and the
different classes of RAID filesystems.

T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
"Testing can show the presence of bugs, but not their absence." -- Dijkstra
"Each new user of a new system uncovers a new class of bugs." -- Kernighan
truesdel@sun217..nas.nasa.gov (David A. Truesdell) (02/10/90)
markh@attctc.Dallas.TX.US (Mark Harrison) writes:
>btw, I have always heard 64 bit integers referred to as "xlongs" (extra
>longs)... is this common, or just our own local jargon?

UTS calls a 64-bit integer a "long long".  On a Cray, it is simply a "long".
I doubt there is any truly "common" term for a type that's longer than a
long.

T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
"Testing can show the presence of bugs, but not their absence." -- Dijkstra
"Each new user of a new system uncovers a new class of bugs." -- Kernighan
jfh@rpp386.cactus.org (John F. Haugh II) (02/11/90)
In article <4849@amelia.nas.nasa.gov> truesdel@sun217..nas.nasa.gov (David A. Truesdell) writes:
>A striped (or striping) filesystem is one in which the filesystem is spread
>out over a set of disks in order to increase capacity and/or performance
>and/or reliability.  The filesystem I'm testing would be classed as "level 5
>RAID".  (That's "Redundant Array of Inexpensive Disks"; too bad our disks
>can't really be called "Inexpensive".)

I think you've described three different types of file system schemes.

Striping, from what I've seen, refers to laying consecutive cylinders out
on consecutive drives so that a seek on one drive can occur at the same
time as the transfer on the next drive; thus, seeks are free for sequential
reads.

Another strategy is mirroring, which puts redundant copies of the data on
one or more drives [ usually more than one ] to increase the reliability of
the data.  A drive system with two 50,000Hr MTBF drives mirroring each
other would have a MTBF of decades or centuries instead of years.  A failed
drive could be powered down and replaced without the need to re-boot the
entire system, provided the hardware permitted drive replacement with the
power on.

The simplest reason to use more than one drive is to create a filesystem
larger than any of the single drives involved.  I've seen this referred to
as "spanning".  The beginning of one drive is the logical end of the
previous drive.  Thus, two 250MB drives could be combined to make a single
500MB logical drive, and so on.

Device drivers for all of these schemes are fairly trivial once the
underlying physical device driver is written.
--
John F. Haugh II                        UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                 Domain: jfh@rpp386.cactus.org
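A toy sketch of the block-mapping arithmetic behind two of these schemes.
The drive count and per-drive capacity are arbitrary, and this shows
block-level striping where the posting above describes a cylinder-level
variant (the arithmetic is the same with cylinders in place of blocks):

    #define NDRIVES          4          /* arbitrary number of member drives */
    #define BLOCKS_PER_DRIVE 500000L    /* arbitrary per-drive capacity, in blocks */

    /* Striping: consecutive logical blocks rotate across the drives, so
     * sequential I/O keeps every spindle busy at once. */
    void stripe_map(long logical, int *drive, long *physical)
    {
        *drive    = (int)(logical % NDRIVES);
        *physical = logical / NDRIVES;
    }

    /* Spanning: the drives are simply concatenated; the start of one
     * drive is the logical end of the previous one. */
    void span_map(long logical, int *drive, long *physical)
    {
        *drive    = (int)(logical / BLOCKS_PER_DRIVE);
        *physical = logical % BLOCKS_PER_DRIVE;
    }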
ejp@bohra.cpg.oz (Esmond Pitt) (02/12/90)
In article <11372@attctc.Dallas.TX.US> markh@attctc.Dallas.TX.US (Mark Harrison) writes:
>
>As Unix tries to get a larger share of the commercial market, we will see
>a need for storing numeric values with 18-digit precision, a la COBOL and
>the IBM mainframe.  This can be accomplished in 64 bits, and is probably
>the reason "they" chose 18 digits as their maximum precision.

According to a fellow who had been on the original IBM project in the
fifties, the 18 digits came about because of using BCD (4-bit decimal)
representation, in two 36-bit words.
--
Esmond Pitt, Computer Power Group
ejp@bohra.cpg.oz
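The arithmetic fits: two 36-bit words give 72 bits, and at 4 bits per BCD
digit that is 72 / 4 = 18 decimal digits exactly.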
rcd@ico.isc.com (Dick Dunn) (02/13/90)
peralta@pinocchio.Encore.COM (Rick Peralta) writes:
> What are the feelings here regarding 64 bit longs?
. . .
> . larger disk storage
>   (no joke single volumes will be breaking lseek() soon)

Files are already breaking a 32-bit lseek pointer.  But shouldn't that one
be tackled differently?  The second argument to lseek should be an off_t,
not a long (and certainly not an int, as some have tried to inflict on us).

Perhaps the appearance of real uses for 64-bit integers and/or pointers
should cause us to think a little harder about the problems we've created
in the past.  The 16->32 transition was painful enough.

(Note that the "64-bit" discussion is also going on in comp.arch.)
--
Dick Dunn     rcd@ico.isc.com    uucp: {ncar,nbires}!ico!rcd     (303)449-2870
   ...Mr. Natural says, "Use the right tool for the job."
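For comparison, a call written the way Dick suggests, with the offset
carried in an off_t so that its width is whatever the implementation says
it should be; the wrapper name is arbitrary:

    #include <sys/types.h>
    #include <unistd.h>

    /* Seek to an absolute position expressed as an off_t; if off_t ever
     * grows to 64 bits, this code needs no change. */
    off_t seek_to(int fd, off_t position)
    {
        return lseek(fd, position, SEEK_SET);
    }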
truesdel@sun217..nas.nasa.gov (David A. Truesdell) (02/13/90)
jfh@rpp386.cactus.org (John F. Haugh II) writes:
>In article <4849@amelia.nas.nasa.gov> truesdel@sun217..nas.nasa.gov (David A. Truesdell) writes:
>>A striped (or striping) filesystem is one in which the filesystem is spread
>>out over a set of disks in order to increase capacity and/or performance
>>and/or reliability.

>I think you've described three different types of file system schemes.

No, there are a lot of different filesystem schemes which can display these
same attributes (capacity, performance, reliability) to differing degrees.

>Striping, from what I've seen, refers to laying consecutive cylinders out
>on consecutive drives so that a seek on one drive can occur at the same
>time as the transfer on the next drive; thus, seeks are free for sequential
>reads.

Another variation can place consecutive blocks on drives with different data
paths, which can increase the I/O transfer rate above that of an individual
drive (or data path).  Seeks would be concurrent, too.

>Another strategy is mirroring, which puts redundant copies of the data
>on one or more drives [ usually more than one ] to increase the reliability
>of the data.  A drive system with two 50,000Hr MTBF drives mirroring each
>other would have a MTBF of decades or centuries instead of years.  A failed
>drive could be powered down and replaced without the need to re-boot the
>entire system, provided the hardware permitted drive replacement with the
>power on.

A "shadowed", or "mirrored", filesystem is very reliable; however, for a
large site this can become quite expensive.  Imagine having to buy twice (or
more) the amount of disk needed to hold all your data.  Other variations of
RAID filesystems (a mirror disk is classed as a "Level 1" RAID) can employ
error correction techniques to obtain more than adequate reliability without
wasting 50% of your disk capacity.  In addition, a mirrored filesystem won't
help your I/O throughput.

The equation below shows how to calculate the effective MTBF for a
multi-disk filesystem.  The variables are: the MTBF of a disk (MTBFdisk),
the mean time to repair for a disk (MTTRdisk), the number of data disks
(#data) and the number of disks with redundant data (#ecc).

                       ( MTBFdisk ) ^ 2
    MTBFfs = ----------------------------------
             #data * (#data + #ecc) * MTTRdisk

>The simplest reason to use more than one drive is to create a filesystem
>larger than any of the single drives involved.  I've seen this referred to
>as "spanning".  The beginning of one drive is the logical end of the
>previous drive.  Thus, two 250MB drives could be combined to make a single
>500MB logical drive, and so on.

However, this simple approach is not without its own risks.  If redundant
information is not kept, the equation above degenerates into:

              MTBFdisk
    MTBFfs = ----------
                #data

So if you use your 50,000 hour MTBF disks, your filesystem ends up with a
MTBF of 25,000 hours.  And the more disks you add, the worse it gets.

Try working out the numbers for yourself.  Consider a filesystem which you
want to span 11 disks.  A striped filesystem, with a single ecc disk, would
require a total of 12 drives.  Using 50,000 hours as the MTBF and 10 hours
for the time to repair, you get a mean time between failure for the
filesystem of 1,893,939 hours (or 216 years).  A mirrored filesystem
(spanning the disks) of the same capacity would require a total of 22
drives, and would have a MTBF of 1,033,057 hours (or 117 years).  For the
worst case, a simple "spanned" filesystem would require only 11 disks, but
would have a MTBF of 4,545 hours, or 189 DAYS.
T.T.F.N.,
dave truesdell (truesdel@prandtl.nas.nasa.gov)
"Testing can show the presence of bugs, but not their absence." -- Dijkstra
"Each new user of a new system uncovers a new class of bugs." -- Kernighan
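A quick cross-check of those figures; this small program just evaluates the
two formulas above with the numbers from the worked example (50,000-hour
drives, 10-hour repair time, 11 data disks):

    #include <stdio.h>

    /* Effective filesystem MTBF when redundant (ecc or mirror) disks are present. */
    double mtbf_redundant(double mtbf_disk, double mttr_disk, int ndata, int necc)
    {
        return (mtbf_disk * mtbf_disk) /
               ((double)ndata * (ndata + necc) * mttr_disk);
    }

    /* Effective filesystem MTBF with no redundancy (simple spanning). */
    double mtbf_spanned(double mtbf_disk, int ndata)
    {
        return mtbf_disk / ndata;
    }

    int main(void)
    {
        double mtbf = 50000.0;  /* per-disk MTBF, hours */
        double mttr = 10.0;     /* per-disk mean time to repair, hours */

        printf("striped, 11 data + 1 ecc:   %.0f hours\n",
               mtbf_redundant(mtbf, mttr, 11, 1));      /* about 1,893,939 */
        printf("mirrored, 11 data + 11:     %.0f hours\n",
               mtbf_redundant(mtbf, mttr, 11, 11));     /* about 1,033,058 */
        printf("spanned, 11 data:           %.0f hours\n",
               mtbf_spanned(mtbf, 11));                 /* about 4,545 */
        return 0;
    }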
kak@hico2.UUCP (Kris A. Kugel) (02/19/90)
In article <11372@attctc.Dallas.TX.US>, markh@attctc.Dallas.TX.US (Mark Harrison) writes:
> writes:
> >What are the feelings here regarding 64 bit longs?
>
> As Unix tries to get a larger share of the commercial market, we will see
> a need for storing numeric values with 18-digit precision, a la COBOL and
> the IBM mainframe.
>
> btw, I have always heard 64 bit integers referred to as "xlongs" (extra
> longs)... is this common, or just our own local jargon?
>
> Mark Harrison
> (markh @ attctc)

We are starting to have problems because of the wide variety of word sizes
on the machines UNIX runs on.  Does it make sense that a long is such a
different size on different machines?  What if you want a guaranteed
precision?  I'm beginning to think that some kind of declaration construct
like "int(need32) var;" is needed.

The layout of structures is another problem; my friends at NETWISE seem to
think that they have a solution, but it seems to me to make more sense to be
able to specify an exact layout, good over ALL machines, than to translate
every message sent over a heterogeneous network.  But this means language
support.  Isn't it about time we bit the bullet and decided that the C
language needs to support types, structures, and ints that look the same
from one machine to another?  We are only going to network more in the
future, not less.

                               Kris A. Kugel
                 {uunet,att,rutgers}!westmark!hico2!kak   <--daily
                             ssbn!hico2!kak               <--semi-daily
gwyn@smoke.BRL.MIL (Doug Gwyn) (02/21/90)
In article <194@hico2.UUCP> kak@hico2.UUCP (Kris A. Kugel) writes:
>We are starting to have problems because of the wide variety of
>word sizes on the machines UNIX runs on.  Does it make sense that
>a long is such a different size on different machines?  What if
>you want a guaranteed precision?  I'm beginning to think that
>some kind of declaration construct like int(need32) var; is needed.

This is not a UNIX issue; it's a programming language issue.  For C, the
answer is: yes, it DOES make sense to allow the implementation to take into
account the characteristics of the system it runs on.  There are ways in C
to program portably; use them.

>We are only going to network more in the future, not less.

We're well aware of the issues you raised.  They are not properly solved by
tacking inadequate kludges onto programming languages.
johnl@gronk.UUCP (John Limpert) (02/21/90)
In article <194@hico2.UUCP> kak@hico2.UUCP (Kris A. Kugel) writes:
>We are starting to have problems because of the wide variety of
>word sizes on the machines UNIX runs on.  Does it make sense that
>a long is such a different size on different machines?  What if
>you want a guaranteed precision?  I'm beginning to think that
>some kind of declaration construct like int(need32) var; is needed.

Sounds like you want declarations like those in PL/I or ADA.  I think it
would be a real bucket of worms for compiler developers.  C is primarily a
systems programming language; it makes no attempt to hide the hardware from
the programmer.  The virtual-machine philosophy used by some programming
languages just isn't appropriate for C.  Typedefs and defines can be used to
match native machine types to the needs of the program.

>The layout of structures is another problem; my friends at NETWISE
>seem to think that they have a solution, but it seems to me to
>make more sense to be able to specify exact layout, good over
>ALL machines, than to translate every message sent over a
>heterogeneous network.  But this means language support.  Isn't it
>about time we bit the bullet and decided that the C language needs
>to support types, structures, and ints that look the same from one
>machine to another?  We are only going to network more in the future,
>not less.

This would cause big problems for machines that differ significantly from
the architecture of the proposed 'C virtual machine'.  I expect C to give me
efficient use of the hardware.  If I wanted portability at any cost, then I
would use ADA.  Suggestions of this sort seem to come up with regularity.
Don't try to change C into some nice, safe, portable programming language
with all the sharp edges removed; pick another language.
--
John Limpert
johnl@gronk.UUCP	uunet!n3dmc!gronk!johnl
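A sketch of the typedef approach John mentions: a hypothetical per-machine
header (the file name, macro, and type names below are all invented for
illustration) maps generic width names onto whatever native types happen to
be wide enough:

    /* machtypes.h -- hypothetical per-machine width typedefs.  Each port
     * of the software supplies the mapping appropriate to its compiler. */

    #if defined(THIS_MACHINE_HAS_32BIT_INT)    /* made-up configuration macro */
    typedef int            int32;
    typedef unsigned int   uint32;
    #else                                      /* fall back to long, which C
                                                  guarantees is at least 32 bits */
    typedef long           int32;
    typedef unsigned long  uint32;
    #endif

    typedef short          int16;              /* shorts are at least 16 bits */

    /* Application code then asks for the width it needs: */
    typedef int32 record_count;                /* needs at least 32 bits of range */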
peralta@pinocchio.Encore.COM (Rick Peralta) (02/22/90)
In article <194@hico2.UUCP> kak@hico2.UUCP (Kris A. Kugel) writes:
>> >What are the feelings here regarding 64 bit longs?
>
>We are starting to have problems because of the wide variety of
>word sizes on the machines UNIX runs on.  Does it make sense that
>a long is such a different size on different machines?  What if
>you want a guaranteed precision?  I'm beginning to think that
>some kind of declaration construct like int(need32) var; is needed.
>The layout of structures is another problem;
>Isn't it about time we bit the bullet and decided that the C language
>needs to support types, structures, and ints that look the same from
>one machine to another?

Standardizing makes infinite sense, but is a logistical monster.  Maybe a
switch that can regress to the "old way" (whatever that is) and defaults to
a new standard can be managed.  As for required sizes, there is a mechanism:

	int x:32;

Byte ordering is a real issue.  Casting or declaring a type to have a
particular byte order seems wonderful, 'till you look at what it does to the
compiler folks.  They would have to convert every data item's byte order for
each operation.  (How about:  1234 int x;  (1234) x++;  (4321) x--; )

Since we're getting the compiler people excited, why not have some fun...
Why can't we have the compiler manage math sizes other than those supported
by the current hardware?  For example: a 16 bit machine with 32 or 64 bit
math.  If the hardware is inadequate, just call a library or inline the
code.  That way math code would no longer be functionally limited by the
hardware.

 - Rick "But it should be put on the standards list..."
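For reference, the "int x:32" syntax Rick mentions is C's bit-field
notation, which is only available for members of a struct or union; a small
example, assuming a machine whose int is at least 32 bits wide:

    /* Bit-fields request explicit member widths, but only inside a
     * struct or union, and the compiler still decides how the fields
     * are packed and ordered, so this does not pin down the layout
     * across machines. */
    struct packet_header {
        unsigned int version  : 4;    /* 4-bit protocol version */
        unsigned int type     : 4;    /* 4-bit message type     */
        unsigned int length   : 24;   /* 24-bit payload length  */
        unsigned int sequence : 32;   /* 32-bit sequence number */
    };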
guy@auspex.auspex.com (Guy Harris) (02/23/90)
>Isn't it about time we bit the bullet and decided that the C language needs
>to support types, structures, and ints that look the same from one
>machine to another?  We are only going to network more in the future,
>not less.

I hate to have to break the sad news to you, but size and structure layout
aren't the only issues here.  Byte order is another issue, and
floating-point format is still another.  No matter *how* much we network in
the future, it's not at all clear that the problem can be properly "fixed"
by changing the language so that you can tell the compiler to do things with
data structures and members thereof....
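The byte-order half of that problem is usually handled today without any
language change, by defining an on-the-wire order and marshalling
explicitly; floating-point format still needs its own agreed interchange
representation.  A minimal sketch:

    /* Encode a 32-bit value into a fixed "network" byte order
     * (big-endian), one byte at a time, so the result is identical no
     * matter what the host machine's native byte order is. */
    void put32(unsigned char *buf, unsigned long value)
    {
        buf[0] = (unsigned char)((value >> 24) & 0xff);
        buf[1] = (unsigned char)((value >> 16) & 0xff);
        buf[2] = (unsigned char)((value >> 8)  & 0xff);
        buf[3] = (unsigned char)(value & 0xff);
    }

    /* Decode it again on the receiving side. */
    unsigned long get32(const unsigned char *buf)
    {
        return ((unsigned long)buf[0] << 24) |
               ((unsigned long)buf[1] << 16) |
               ((unsigned long)buf[2] << 8)  |
                (unsigned long)buf[3];
    }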
eloranta@tukki.jyu.fi (Jussi Eloranta) (02/25/90)
I just compiled gnulib2 for gcc (the 64-bit library).  How do I access it?
(i.e., how do I specify whether I want 32 or 64 bit integers?)

I didn't find anything in the docs..

Thanks,
jussi
--
============================================================================
Jussi Eloranta               Internet(/Bitnet):
University of Jyvaskyla,     eloranta@tukki.jyu.fi
Finland                      [128.214.7.5]
meissner@osf.org (Michael Meissner) (02/27/90)
In article <3538@tukki.jyu.fi> eloranta@tukki.jyu.fi (Jussi Eloranta) writes:
| I just compiled gnulib2 for gcc (the 64-bit library).  How do I access it?
| (i.e., how do I specify whether I want 32 or 64 bit integers?)
|
| I didn't find anything in the docs..

Yeah, it isn't in the docs.  To use a 64-bit type, just use:

	long long

which seems to be a convention among several compilers.  Make sure gnulib
is on the link line (the simplest way is to use gcc to link the programs).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA

Catproof is an oxymoron, Childproof is nearly so
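A minimal example of that usage; the file name is arbitrary, and since the
system printf generally does not understand the wider type yet, the value
is printed as two 32-bit halves:

    /* big.c -- use GNU C's "long long" extension.  Compile and link
     * with gcc itself so that gnulib (which supplies the 64-bit
     * arithmetic helpers on machines without 64-bit hardware) is
     * pulled in automatically:
     *
     *     gcc -o big big.c
     */
    #include <stdio.h>

    int main(void)
    {
        long long offset = 5;
        int i;

        /* Grow the value well past the 32-bit limit. */
        for (i = 0; i < 60; i++)
            offset *= 2;

        /* Print as two 32-bit halves rather than trusting printf with
         * a long long argument. */
        printf("offset = 0x%08lx%08lx\n",
               (unsigned long)(offset >> 32),
               (unsigned long)(offset & 0xffffffff));
        return 0;
    }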