dave@astra.necisa.oz (Dave Horsfall) (06/25/87)
Can anyone tell me in an unambiguous manner just how many bytes
are in the following System V (2.?) "blocks"? Make any assumptions
you like, if it makes a difference. I have yet to see a clear
reference on this matter. (BSD people stop laughing, I'll bet you're
not much better off!)

	BUFSIZ
	ls -s
	du
	df
	tar -b
	cpio -B
	mkfs
	fsck
	fsdb

and so forth - I'm sure I missed some ...

-- 
Dave Horsfall (VK2KFU)          TEL: +61 2 438-3544  FAX: +61 2 439-7036
NEC Information Systems Aust.   ACS: dave@astra.necisa.oz (also CSNET)
3rd Floor, 99 Nicholson St      ARPA: dave%astra.necisa.oz@seismo.css.gov
St. Leonards NSW 2064           UUCP: {enea,hplabs,mcvax,prlb2,seismo,ukc}!\
AUSTRALIA                             munnari!astra.necisa.oz!dave
mats@forbrk.UUCP (Mats Wichmann) (06/30/87)
In article <218@astra.necisa.oz> dave@astra.necisa.oz (Dave Horsfall) writes:
>Can anyone tell me in an unambiguous manner just how many bytes
>are in the following System V (2.?) "blocks"?
>
>	ls -s  du  df  tar -b  cpio -B  mkfs  fsck  fsdb

All of the preceding are 512-byte "blocks" and refer to "disk" blocks;
the unit is left at 512 to avoid having to change things around on systems
which support different logical block sizes on different file systems,
and just for general consistency (let's see, this is a Frozzboz 1000, the
blocks must be 875 bytes each, unlike the 1500, where they are 950 each...).

>	BUFSIZ

This is the stdio buffer size and varies from system to system, although
it seems to be 1024 for most V.2 implementations - it should be the same
size as the largest allowable file system logical block.

Note that there are programs, such as cpio, which take an argument
(-B in the case of cpio) which seems to indicate that the block size
is changed; really this sets the "blocking factor" - how many blocks to
collect before doing the physical write/read. The number reported by
cpio when it finishes is still in terms of 512-byte blocks.

Mats Wichmann
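The arithmetic in the post above can be sketched in a few lines. This is an illustration only: the 512-byte reporting unit comes from the post, while the function names and the blocking factor of 10 are assumptions made for the example, and real utilities may also count indirect blocks or round differently.

```python
# Sketch only: 512 is the reporting unit named in the post above.
# The function names and the blocking factor of 10 are illustrative assumptions.
BLOCK_SIZE = 512


def bytes_to_blocks(nbytes, block_size=BLOCK_SIZE):
    """Blocks needed to hold nbytes, counting a partial block as a whole one."""
    return (nbytes + block_size - 1) // block_size


def blocks_to_bytes(nblocks, block_size=BLOCK_SIZE):
    """Bytes covered by nblocks reporting blocks."""
    return nblocks * block_size


def record_size(blocking_factor, block_size=BLOCK_SIZE):
    """Bytes moved per physical read/write for a given blocking factor."""
    return blocking_factor * block_size


def transfers_needed(nbytes, blocking_factor, block_size=BLOCK_SIZE):
    """Physical transfers needed to move nbytes at the given blocking factor."""
    rec = record_size(blocking_factor, block_size)
    return (nbytes + rec - 1) // rec


print(bytes_to_blocks(3000))       # size of a 3000-byte file in 512-byte blocks
print(transfers_needed(3000, 10))  # physical transfers at a blocking factor of 10
```

The point of the sketch is that the blocking factor changes how much is moved per transfer, not the unit in which sizes are reported.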
allbery@ncoast.UUCP (Brandon Allbery) (07/01/87)
As quoted from <218@astra.necisa.oz> by dave@astra.necisa.oz (Dave Horsfall):
+---------------
| Can anyone tell me in an unambiguous manner just how many bytes
| are in the following System V (2.?) "blocks"? Make any assumptions
| you like, if it makes a difference. I have yet to see a clear
| reference on this matter. (BSD people stop laughing, I'll bet you're
| not much better off!)
+---------------

There is no single "block" size. Instead, there are "tape blocks" --
usually 512 bytes for historical reasons (and thereby compatibility) --
and "disk blocks" (which used to be 512 bytes but on most modern systems
are 1024). If you have an improperly ported System V (or a clone System V
that was never within 10 miles of AT&T since it was V7 compatible), you
may have 512-byte blocks no matter what, or some/all of the programs may
not have been changed to reflect the new block size.

Worse, System V can handle both kinds of filesystems, so you may have a
partition with 512-byte blocks and one with 1024-byte blocks....

In general, tape-oriented utilities (tar, dd, cpio) use tape blocks
(512 bytes) and disk utilities (including stdio) use 1024-byte disk blocks
even on 512-byte block file systems. (Low-level utilities like mkfs and
fsck/fsdb will use the actual block size of the file system.)

++Brandon
-- 
 ---- Moderator for comp.sources.misc and comp.binaries.ibm.pc ----
Brandon S. Allbery	<BACKBONE>!cbosgd!ncoast!allbery
aXcess Company		{ames,mit-eddie,harvard,talcott}!necntc!ncoast!allbery
6615 Center St. #A1-105	{well,sun,pyramid,ihnp4}!hoptoad!ncoast!allbery
Mentor, OH 44060-4101	necntc!ncoast!allbery@harvard.HARVARD.EDU (Internet)
+01 216 974 9210	ncoast!allbery@CWRU.EDU (CSnet)
Brandon Allbery on 157/504 (Fidonet/Matrix/whatever)
daryl@ihlpe.UUCP (07/09/87)
In article <348@forbrk.UUCP>, mats@forbrk.UUCP (Mats Wichmann) writes: > In article <218@astra.necisa.oz> dave@astra.necisa.oz (Dave Horsfall) writes: > >Can anyone tell me in an unambiguous manner just how many bytes > >are in the following System V (2.?) "blocks"? > > > ls -s du df tar -b cpio -B mkfs fsck fsdb > All of the preceding are 512 byte "blocks" and refer to "disk" blocks; > > BUFSIZ > This is the stdio buffer size and varies from system to system, although > it seems to be 1024 for most V.2 implementations - should be the same size UTS block sizes on our machines differ from "normal" VAX/sun/most things. To this day I am not sure why we cannot use either bytes, Kbytes, Mbytes, etc. instead of blocks. Daryl Monge UUCP: ...!ihnp4!ihcae!dlm AT&T CIS: 72717,65 Bell Labs, Naperville, Ill AT&T 312-979-3603
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/10/87)
In article <1852@ihlpe.ATT.COM> daryl@ihlpe.ATT.COM (Daryl Monge) writes: >To this day I am not sure why we cannot use either bytes, Kbytes, Mbytes, etc. >instead of blocks. How big is a "byte"? (No, it's not necessarily 8 bits!) How about sizing things in terms of number of bits, which is a universal measure.
rjd@tiger.UUCP (07/13/87)
>>To this day I am not sure why we cannot use either bytes, Kbytes, Mbytes, etc. >>instead of blocks. > > How big is a "byte"? (No, it's not necessarily 8 bits!) > How about sizing things in terms of number of bits, which is a > universal measure. O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight bits!!! Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is sixteen bits (or was this a mouthful?), etc.... Randy
guy%gorodish@Sun.COM (Guy Harris) (07/14/87)
> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight > bits!!! Don't say that within earshot of any PDP-10 aficionados.... Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
roy@phri.UUCP (Roy Smith) (07/15/87)
In article <142700010@tiger.UUCP> rjd@tiger.UUCP writes:
> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight bits!!!
> Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is
> sixteen bits (or was this a mouthful?), etc....

	No, no, no, a thousand times NO! A byte is NOT NECESSARILY 8 bits!
Granted, on most of the popular machines you are likely to see today (Vax,
PDP-11, 680x0, 320xx, 80x86, Pyramid, etc.), a byte is 8 bits, but that
doesn't mean it has to be. A byte is simply some collection of contiguous
bits taken as a unit. Often a byte is that number of bits which most
comfortably holds a single character in the machine's native character
code, but not always. Often the number of bits in a byte is dictated by
the underlying machine architecture, but that's not a hard and fast rule
either. I could write a program on a Vax to read a file in 7-bit bytes if
I wanted to. In fact, if I wanted to read DEC-10 tapes I would have to
write just such a program (and I once did).

	On a DEC-10/20, for example, a byte can reasonably be anything from
1 (0?) to 36 (35?) bits; 6, 7, and 9 bit bytes are all quite common and if
anything, I would say an 8-bit byte on a DEC-10/20 is a mite strange. I'm
not sure byte even has a real meaning on a machine like a Cray.
-- 
Roy Smith, {allegra,cmcl2,philabs}!phri!roy
System Administrator, Public Health Research Institute
455 First Avenue, New York, NY 10016
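Reading a stream "in 7-bit bytes" on an 8-bit machine, as the post above describes, might look roughly like the sketch below. It is not the program Roy wrote; the function name is hypothetical, and the most-significant-bit-first ordering is an assumption, since real tape formats differ in how they lay bits out.

```python
def unpack_bytes(octets, width):
    """Split a sequence of 8-bit octets into width-bit bytes.

    Assumption: bits are consumed most-significant-bit first, and any
    trailing bits that do not fill a whole byte are dropped.
    """
    acc = 0      # bit accumulator
    nbits = 0    # number of unread bits currently held in acc
    out = []
    for octet in octets:
        acc = (acc << 8) | octet
        nbits += 8
        while nbits >= width:
            nbits -= width
            out.append((acc >> nbits) & ((1 << width) - 1))
        acc &= (1 << nbits) - 1   # keep only the bits not yet consumed
    return out


# Two octets hold two whole 7-bit bytes plus two leftover bits.
print(unpack_bytes(bytes([0b10000011, 0b00001100]), 7))
```

The same routine handles 6-, 8-, or 9-bit bytes by changing `width`, which is the sense in which the byte size is a property of how the data is read rather than of the bits themselves.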
malcolm@spar.SPAR.SLB.COM (Malcolm Slaney) (07/15/87)
In article <142700010@tiger.UUCP> rjd@tiger.UUCP writes:
> >> How big is a "byte"? (No, it's not necessarily 8 bits!)
>
> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight
> bits!!!

You and I both know this....but tell that to the Common Lisp people.
In "Common Lisp, The Language" by Guy Steele, 1984 (page 225):

	Several functions are provided for dealing with an arbitrary-
	width field of contiguous bits appearing anywhere in an integer.
	Such a contiguous set of bits is called a "byte". Here the term
	"byte" does not imply some fixed number of bits (such as eight),
	rather a field of arbitrary and user-specifiable width.

ARGGHHHHH.....Talk about making it difficult to move software between a
Symbolics machine (which is where the screwy standard came from, I think)
and a Unix machine.

						Malcolm
melohn%sluggo@Sun.COM (Bill Melohn) (07/15/87)
In article <23436@sun.uucp> guy%gorodish@Sun.COM (Guy Harris) writes: >> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight >> bits!!! > >Don't say that within earshot of any PDP-10 aficionados.... Yes, you've clearly bit off more than you can chew. An Octet (as described in the TCP/IP RFCs) IS 8 bits; bytes are arbitrary both in size and order within the large variety of machine architectures.
davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) (07/15/87)
In article <2792@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>In article <142700010@tiger.UUCP> rjd@tiger.UUCP writes:
>> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight bits!!!
>> Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is
>> sixteen bits (or was this a mouthful?), etc....
>

Let me clarify this:
	8 bits is a byte
	4 bits is a nybble
	2 bits is a tayste (actually 2 bits is a quarter)

36 bit machines usually support at least 6 and 9 bit bytes in hardware,
although I'm sure someone will write and tell me that their machine is
not only obsolete but brain-damaged as well and doesn't have any
hardware bytes.

36 bit machines were a great idea which fell by the wayside... the
extra bit in the byte allows many extended character sets (ASCII + 384
others), the short is +/-262144, large enough for many applications,
and the long is +/-64*10^9, which will hold almost any real world
value. When most of our applications were moved from a Honeywell to
vaxen and an IBM, we did a lot of conversion to long, double, and
real*8, because the number of significant digits dropped to <1.

================================================================
|  Please direct any followup discussion of architecture to   |
|  comp.arch, not wizards!                                     |
================================================================
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {chinet | philabs | sesimo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me
jhh@ihlpl.ATT.COM (Haller) (07/16/87)
In article <2792@phri.UUCP>, roy@phri.UUCP (Roy Smith) writes:
> 	No, no, no, a thousand times NO! A byte is NOT NECESSARILY 8 bits!
>
> 	On a DEC-10/20, for example, a byte can reasonably be anything from
> 1 (0?) to 36 (35?) bits; 6, 7, and 9 bit bytes are all quite common and if
> anything, I would say an 8-bit byte on a DEC-10/20 is a mite strange. I'm
> not sure byte even has a real meaning on a machine like a Cray.
> Roy Smith, {allegra,cmcl2,philabs}!phri!roy

This is why the standards organizations use the term octet rather
than byte. Almost all data networks, and certainly all of the
protocol information (headers, etc.), are octet aligned, making
life very difficult for those manufacturers with "weird" machines.
Unfortunately, mega-octets and giga-octets don't have quite as
nice a ring as megabyte and gigabyte.

John Haller
Isaac_K_Rabinovitch@cup.portal.com (07/16/87)
>> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight
>> bits!!!
>
>Don't say that within earshot of any PDP-10 aficionados....
>	Guy Harris
>	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
>	guy@sun.com

No, the basic unit on a PDP 10 is not a "byte", it's a "word". "Word" was
universal nomenclature for a unit of data before IBM introduced the 360,
the first byte-oriented machine.

An old IBMer once told me that "byte" was an Olde English word meaning
"syllable". Never been able to confirm this.
devine@vianet.UUCP (Bob Devine) (07/16/87)
In article <2792@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
> Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is
> sixteen bits (or was this a mouthful?), etc....

In article <6705@steinmetz.steinmetz.UUCP>, davidsen@steinmetz.steinmetz.UUCP (William E. Davidsen Jr) writes:
> 	8 bits is a byte
> 	4 bits is a nybble
> 	2 bits is a tayste (actually 2 bits is a quarter)

This is a reposting of the results of a question I asked last year.
I had asked what to call a grouping of bits. It all started because I
originally thought that "crayte" would be a marvelous name for a
64-bit group.

Bob Devine

+++++++++++++++++++++++++++++++++++++++++++++++++++++++

 1 bit  == bit[?], byt[1], singlet[4]
 2 bits == quarter[0], dibit[4], doublet[4]
 4 bits == nybble[?], nibble[4], quadlet[4]
 8 bits == byte[?], octlet[4]
16 bits == gulp[2], dysh[3], hexlet[4], playte[5], gulp[6], snack[7,8],
           chomp[9]
32 bits == box[2], coarse[3], triclet[4], plattyr[5], mouthful[6],
           meal[7,8], snarf[9]
64 bits == crayte[0], meel[3], sexlet[4], feast[7,8], gobble[9]
 * bits == buffet[3]

Contributors:
[?] unknown
[0] vianet!devine (Bob Devine)
[1] uiucdcs!mcewan (Scott McEwan)
[2] ima!haddock!karl (Karl W. Z. Heuer)
[3] iuvax!bobmon (Robert Montante)
[4] ccvaxa!aglew (Andy "Krazy" Glew)
[5] sphinx!eric (Eric M. Nelson)
[6] reed!jeanne (Jeanne A. E. DeVoto)
[7] uu.warwick.ac.uk!kay (Kay Dekker)
[8] necis!schuldy (Mark Schuldenfrei)
[9] decuac!bagwill (Bob Bagwill)
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/16/87)
In article <142700010@tiger.UUCP> rjd@tiger.UUCP writes: >A byte IS eight bits!!! No, that started with IBM's System/360 and gained further support from the PDP-11. Before that time, and since, many architectures have either had other fundamental address unit sizes (e.g. 6 or 9 bits) or have supported variable-sized bytes (e.g. CDC). An 8-bit byte is simply not a suitable unit of measure for systems whose fundamental memory unit size is not an integral multiple of 8 bits.
guy%gorodish@Sun.COM (Guy Harris) (07/17/87)
> No, the basic unit on a PDP 10 is not a "byte" it's a "word". I didn't say a byte was the *basic* unit of memory on the 10. It most definitely *did* have the notion of a "byte" in the instruction set, however (consider the Load Byte, Store Byte, Increment Byte Pointer, etc. instructions). Byte pointers indicated the size of the byte, so there was no single byte size in the hardware; I think the original software packed 5 7-bit bytes in a word, with one bit left over. > "Word" was universal nomenclature for unit of data before IBM introduced > the 360, the first byte-oriented machine. Not quite. The IBM 7030 or "Stretch" supported bit addressing; it used an 8-bit byte to store characters. I don't know if they used the term "byte", but it definitely supported access to bytes. (And, if you don't want to consider character-oriented machines like the 14xx series to be "byte-oriented", it's still byte-oriented; Stretch was not one of those machines.) I suspect there were other machines of the general flavor of the 360 out before the 360, as well. Guy Harris {ihnp4, decvax, seismo, decwrl, ...}!sun!guy guy@sun.com
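The packing mentioned above (five 7-bit bytes in a 36-bit word, with one bit left over) can be sketched as follows. The function names are hypothetical, and the layout chosen here (characters left-justified, spare bit at the low-order end, short strings padded with NUL) is an assumption made for the example rather than something the post states.

```python
WORD_BITS = 36
CHAR_BITS = 7
CHARS_PER_WORD = WORD_BITS // CHAR_BITS   # 5 characters, 1 bit left over


def pack_word(text):
    """Pack up to five 7-bit characters into one 36-bit integer.

    Assumed layout: characters left-justified, spare bit at the low end,
    short strings padded with NUL characters.
    """
    if len(text) > CHARS_PER_WORD:
        raise ValueError("at most five characters fit in one word")
    word = 0
    for ch in text.ljust(CHARS_PER_WORD, "\0"):
        code = ord(ch)
        if code > 0x7F:
            raise ValueError("not a 7-bit character: %r" % ch)
        word = (word << CHAR_BITS) | code
    return word << 1   # leave the low-order bit unused


def unpack_word(word):
    """Recover the five 7-bit characters from a word built by pack_word."""
    word >>= 1         # discard the spare bit
    codes = [(word >> (CHAR_BITS * i)) & 0x7F
             for i in range(CHARS_PER_WORD - 1, -1, -1)]
    return "".join(chr(c) for c in codes)


print(oct(pack_word("HELLO")))
print(unpack_word(pack_word("HELLO")))
```

Five characters in 36 bits is where a figure like "7-1/5 bits per character" comes from: 36 / 5 = 7.2.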
dgk@ulysses.homer.nj.att.com (David Korn[eww]) (07/17/87)
I believe that in the mid-seventies the CDC-STAR used the term sword
(super-word) to refer to a 512-bit quantum. At the time I remember
that a 1024-bit word was going to be called a pen, for obvious reasons.
I have not heard these terms used since. Maybe Multiflow has
words for them as well.

David Korn
{ihnp4|allegra}ulysses!dgk
henry@utzoo.UUCP (Henry Spencer) (07/17/87)
> On a DEC-10/20, for example, a byte can reasonably be anything from > 1 (0?) to 36 (35?) bits; 6, 7, and 9 bit bytes are all quite common... Another example worth mentioning is the BBN C/70 and its kin, which have 10-bit bytes as I recall. This isn't quite the same situation as the DEC-20, which has 36-bit words and a rather fuzzy notion of (sort of) bitfields within them; on the C/70, the division of memory into bytes is just as fixed as it is on (say) a VAX, but the bytes are 10 bits wide, no more, no less. There are also machines with 9-bit bytes, although one seldom sees them in the Unix world. And then there's the PDP-8, where you get your choice of 12-bit bytes (ugh) or 6-bit bytes (ARGH)... -- Support sustained spaceflight: fight | Henry Spencer @ U of Toronto Zoology the soi-disant "Planetary Society"! | {allegra,ihnp4,decvax,utai}!utzoo!henry
rjd@tiger.UUCP (07/18/87)
>> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight bits!!!
>> Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is
>> sixteen bits (or was this a mouthful?), etc....
>
> 	No, no, no, a thousand times NO! A byte is NOT NECESSARILY 8 bits!
> .... more on this....

You sound convincing, and I would like to think that you were right,
but I still have my doubts. The way you are describing a byte:

   "....A byte is simply some collection of contiguous bits taken as a unit.
   Often a byte is that number of bits which most comfortably holds a single
   character in the machine's native character code, but not always. Often
   the number of bits in a byte is dictated by the underlying machine
   architecture, but that's not a hard and fast rule either."

This is a word!! On the machines I most commonly work on, even at the
hardware design level, the word size is 32-bit (true 32-bit), and memory
sizes are specified in bytes - 8-bit bytes!! The machine uses ASCII, as do
most except IBM, and ASCII is based on seven bits. So there would be no
reason to use a byte meaning 8 bits unless it WAS so.

I HAVE AN IDEA!!! Let's look it up........ (turning pages on my Webster's):

  byte - n. [arbitrary formation, < BITE ] a string of binary digits,
         usually eight, operated on as a basic unit by a digital computer.

  word - ...... 8. an ordered combination of characters carrying at least
         one meaning that is stored in one location in a computer and that
         is regarded as a unit when stored or transferred by the computer's
         circuits.

I guess you are right, yet I think that common usage dictates a byte be
eight bits. A very good point you have brought up, though, as I thought I
KNEW a byte to ONLY be eight bits, and there seems to be a point of
ambiguity here....

Randy
bzs@bu-cs.BU.EDU (Barry Shein) (07/20/87)
Posting-Front-End: GNU Emacs 18.41.4 of Mon Mar 23 1987 on bu-cs (berkeley-unix) Some more suggestions: In honor of the page size of a Vax: 512 bits == nanopage 1024 should be called a Kbit, there's just no choice, sorry, not funny. -Barry Shein, Boston University
roy@phri.UUCP (Roy Smith) (07/20/87)
Somebody, somewhere, some time ago, in some article wrote: > How about sizing things in terms of number of bits, which is a > universal measure. This unleashed a torrent of silly and not-so-silly articles on the definition of a byte and cute names for N-bit chunks (to which I confess contributing), but nobody really addressed this guy's question. So... Yes, clearly bits is a more precise unit of information than bytes. The problem with reporting file sizes in bits is that most of the time it's not what people want to know. What they really want to know is how many characters long the file is (notice I said characters, not bytes). If I took a text file on a vax that was N characters long and moved it to a DEC-20, it would still be N characters long. Maybe my Vax uses 8 bits per character and your DEC-20 uses 7-1/5 bits per character, but I don't want to know about that (usually). On Unix, ls would show a byte count and on TOPS-20, DIR would show a word count. These numeric values would be different, but the number of characters wouldn't have changed. Well, maybe that's a bad example because TOPS-20 would turn all the newlines into carriage-return/newline pairs, but you get the idea. -- Roy Smith, {allegra,cmcl2,philabs}!phri!roy System Administrator, Public Health Research Institute 455 First Avenue, New York, NY 10016
Isaac_K_Rabinovitch@cup.portal.com (07/20/87)
According to the Oxford English Dictionary Supplement, the "byte" is simply and purely an IBM invention. So it means whatever IBM says it does.
dhesi@bsu-cs.UUCP (Rahul Dhesi) (07/20/87)
In article <2792@phri.UUCP> roy@phri.UUCP (Roy Smith) writes:
>In article <142700010@tiger.UUCP> rjd@tiger.UUCP writes:
>> O.K., I'll byte. (oops, pun initially unintended.) A byte IS eight bits!!!
>> Maybe you are thinking of a word?? And a nibble is four bits, and a gulp is
>> sixteen bits (or was this a mouthful?), etc....
>
>	No, no, no, a thousand times NO! A byte is NOT NECESSARILY 8 bits!
>Granted, on most of the popular machines you are likely to see today (Vax,
>PDP-11, 680x0, 320xx, 80x86, Pyramid, etc.), a byte is 8 bits, but that
>doesn't mean it has to be. A byte is simply some collection of contiguous
>bits taken as a unit.

On modern machines, a byte is 8 bits. On obsolete hardware a byte can
be of arbitrary size.

Since we are now in the 1980s going on to the 1990s, I think it's about
time we streamlined our terminology to reflect the times.

A byte is therefore exactly 8 bits. No more and no less. Opinions to
the contrary belong in the 1960s. Let them lie there and die there.

It's time to upgrade from your 12-bit PDP-8 or your 60-bit CDC or your
36-bit DEC-20 to a new architecture.

In his book "Reliable Data Structures in C", Thomas Plum gives portable
implementations of the memxxx functions (e.g. memset(), memcpy()). He
does not feel the need to point out that these are portable only if the
machine's word will hold exactly an integral number of chars.

If you are moving with the times, welcome aboard. If you have your
feet firmly planted in the 1960s, lots of luck; you will need it.
-- 
Rahul Dhesi         UUCP:  {ihnp4,seismo}!{iuvax,pur-ee}!bsu-cs!dhesi
daryl@ihlpe.ATT.COM (Daryl Monge) (07/21/87)
I think that everyone is getting sidetracked from the original issue;
that is, what is a block as reported by many UNIX utilities?

Clearly "block" is not useful, especially as file systems get more complex
and the notion of a "block" gets confused.

However, "bit" is useless in terms of user friendliness. Imagine:

-rwxr-x---   1 daryl    daryl    3102120bits Feb  6 22:40 gmacs

The number of bits in a byte is not relevant. Perhaps we should use
the word (:-) "character", since at least to me that has some real
world meaning.

ex:
/e31    (/dev/dsk/36bs2):    12632K characters    33572 unique files

Comments?

Daryl Monge				UUCP:	...!ihnp4!ihcae!daryl
AT&T					CIS:	72717,65
Bell Labs, Naperville, Ill		AT&T	312-979-3603
beattie@netxcom.UUCP (Brian Beattie) (07/21/87)
As I recall, the term "byte" was coined by IBM for the 1401. The 1401
had a variable length word delimited by a "word mark". A "byte" was the
smallest addressable object. Through usage it has come to mean the smallest
object larger than 1 bit but less than a "word" that can be manipulated by
the CPU (a word being the "natural object" of the CPU).

PS. the 1401 had an 8-bit byte.
-- 
-----------------------------------------------------------------------
Brian Beattie			| Phone: (703)749-2365
NetExpress Communications, Inc.	| uucp: seismo!sundc!netxcom!beattie
1953 Gallows Road, Suite 300	|
Vienna,VA 22180			|
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/21/87)
In article <857@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>A byte is therefore exactly 8 bits. No more and no less. Opinions to
>the contrary belong in the 1960s. Let them lie there and die there.

And people who believe that 8 bits is sufficient to encode a
character are either naive or stupid.
beede@hubcap.UUCP (Mike Beede) (07/22/87)
in article <857@bsu-cs.UUCP>, dhesi@bsu-cs.UUCP (Rahul Dhesi) says:
>
> [ much deleted ]
>
> Since we are now in the 1980s going on to the 1990s, I think it's about
> time we streamlined our terminology to reflect the times.
>
> A byte is therefore exactly 8 bits. No more and no less. Opinions to
> the contrary belong in the 1960s. Let them lie there and die there.
>
> [ more deleted ]

I suppose that you've allowed for all possible increases in character set
size, possibly including fonts encoded on a per-character basis? And any
advances in technology, too? While we're at it, let's standardize on
whatever machine you like, with all the ``modern'' features, and get rid of
all these other nasty architectures with their own idiosyncratic features.

Oh well, all right :-> / 2.

Seriously--different machines serve different purposes, and so are designed
differently. That is why it is foolish to freeze some design parameter
arbitrarily. I don't see that there is, for instance, a clear argument
against 36 bit words and 9 bit bytes as opposed to 32 bit words and 8 bit
bytes, especially if your application works well with 9 bit quantities.
-- 
Mike Beede
Computer Science Dept.          UUCP: . . . !gatech!hubcap!beede
Clemson University              INET: beede@hubcap.clemson.edu
Clemson SC 29631-1906           YOUR DIME: (803)656-{2845,3444}
seifert@doghouse.gwd.tek.com (Snoopy) (07/22/87)
In article <9815@bu-cs.BU.EDU> bzs@bu-cs.BU.EDU (Barry Shein) writes: >1024 should be called a Kbit, there's just no choice, sorry, not funny. The plural of kbit is of course kibitz, as in "How much netnews came in last night?" "Lots of kibitz." :-) Snoopy tektronix!doghouse.gwd!snoopy snoopy@doghouse.gwd.tek.com "And it's a middle-endian machine with trinary logic." "They would do that!"
greywolf@unisoft.UUCP (The Grey Wolf @ ext 165) (07/22/87)
In article <6144@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <857@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>A byte is therefore exactly 8 bits. No more and no less. Opinions to
>>the contrary belong in the 1960s. Let them lie there and die there.
>
>And people who believe that 8 bits is sufficient to encode a
>character are either naive or stupid.

And people who believe that there is a nice, neat way to implement an
alternate solution throughout the computer industry are foolish, buttheaded
and/or not very capable of mental throughput.
[ or they own a Cray, where everything is a double (64 bits) anyway...]

What is the problem here? I see nothing wrong with eight bits for a
character. Can you come up with anything better? What's the matter? Are
escape sequences for special characters too much for you to handle?

Gimme a break.

Disgusted that this discussion is even *happening*,

	Roan Anderson

	...sun!    \
	...ucbvax!  >unisoft! \
	...dual!   /           \
	                        >greywolf
	..sun!island!unicom!   /
	..ucbvax!well!unicom! /
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/23/87)
In article <463@unisoft.UUCP> greywolf@unisoft.UUCP (The Grey Wolf @ ext 165) writes: >I see nothing wrong [with] eight bits for a character. I take it you don't pay much attention to the rest of the world, then.
jhh@ihlpl.ATT.COM (Haller) (07/23/87)
In article <326@hubcap.UUCP>, beede@hubcap.UUCP (Mike Beede) writes: > Seriously--different machines serve different purposes, and so are designed > differently. That is why it is foolish to freeze some design parameter > arbitrarily. I don't see that there is, for instance, a clear argument > against 36 bit words and 9 bit bytes as opposed to 32 bit words and 8 bit > bytes, especially if your application works well with 9 bit quantities. The clear argument against 36 bit words and 9 bit bytes is data communications. Like it or not, data communications have virtually standardized the 8 bit byte. Just try to generate TCP/IP headers on a 9 bit machine, packing all of the data contiguously. Oh, you want to use a communications processor to do that? How many bits is its byte? Look at some of the higher level ISO protocols, and you will find that the basic unit of data is, surprise, surprise, an octet. Oh, sure, there is support for arbitrary bit strings, but even they are padded to octet boundaries. Back in the 60's, and possibly up to the mid 70's, when 7 track mag tape wasn't considered hopelessly obsolete, there was room for argument on what size provided the best 'byte'. However, the de facto standard is an 8 bit byte, which is becoming more and more institutionalized as time progresses. Given that a byte is an important measure, byte addressability becomes important in hardware architectures. Given that our machines operate with binary logic, word sizes are going to be powers of two bytes long, just so that byte addresses can be easily converted into word addresses, which is typically related by a power of two to the memory and bus architecture. Look at the Harris/6 if you want to see what kind of contortions were necessary to provide byte addressability with a 24 bit word size. 
In summary, I agree that while there was no good technical reason to have an eight bit byte originally, anyone designing a new computer that does not have an eight bit byte will be doomed to market failure. If Univac's 1100 series had taken off better than IBM's machines, I would probably be saying that six bit bytes are the wave of the future. That is not the case. John Haller
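The address arithmetic behind that argument can be sketched briefly. With a power-of-two number of bytes per word, splitting a byte address into a word address and a byte offset is a shift and a mask; with three bytes per word (as a 24-bit word with 8-bit bytes would give) it takes a divide and a remainder. The function names are hypothetical, and this says nothing about how the Harris/6 actually handled it.

```python
def split_pow2(byte_addr, log2_bytes_per_word):
    """Word address and byte offset when bytes-per-word is a power of two.

    Only a shift and a mask are needed, which is cheap in hardware.
    """
    mask = (1 << log2_bytes_per_word) - 1
    return byte_addr >> log2_bytes_per_word, byte_addr & mask


def split_general(byte_addr, bytes_per_word):
    """Word address and byte offset for any bytes-per-word value.

    Needs a true division, e.g. by 3 for a 24-bit word of 8-bit bytes.
    """
    return divmod(byte_addr, bytes_per_word)


print(split_pow2(13, 2))     # 4 bytes per word (32-bit word)
print(split_general(13, 3))  # 3 bytes per word (24-bit word)
```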
louie@sayshell.umd.edu (Louis A. Mamakos) (07/24/87)
In article <2425@ihlpl.ATT.COM> jhh@ihlpl.ATT.COM (Haller) writes:
>In article <326@hubcap.UUCP>, beede@hubcap.UUCP (Mike Beede) writes:
>> Seriously--different machines serve different purposes, and so are designed
>> differently. That is why it is foolish to freeze some design parameter
>> arbitrarily. I don't see that there is, for instance, a clear argument
>> against 36 bit words and 9 bit bytes as opposed to 32 bit words and 8 bit
>> bytes, especially if your application works well with 9 bit quantities.
>
>The clear argument against 36 bit words and 9 bit bytes is data
>communications. Like it or not, data communications have virtually
>standardized the 8 bit byte. Just try to generate TCP/IP headers
>on a 9 bit machine, packing all of the data contiguously. Oh,
>you want to use a communications processor to do that? How
>many bits is its byte?

Well, gee whiz; I've done just that. Having 9 bits/byte doesn't make this
task any more difficult than having byteswapped (little-endian) hardware.
The 1100 communications hardware (serial communications lines, byte/block
multiplexor channels, etc.) just uses the lower 8 bits of each byte to put
on the wire. When the packet arrives, I do the equivalent of ntohl() and
ntohs() on the appropriate header fields so that I can do arithmetic, etc.
on them. On a VAX, these operations byteswap; on the 1100 they simply
squeeze the 9th bit out of each byte. Of course, in the actual
implementation it's not called ntohl() and it's written in PLUS, not C.

Also, having an 1100 makes calculating the 1's complement checksum easy, as
it's a ones-complement machine.

Having 9-bit bytes comes in handy. You can leave "cookies" in a character
string that are unique from any possible ASCII character value.

>Given that a byte is an important measure, byte addressability
>becomes important in hardware architectures. Given that
>our machines operate with binary logic, word sizes are going
>to be powers of two bytes long, just so that byte addresses
>can be easily converted into word addresses, which is typically
>related by a power of two to the memory and bus architecture.
>Look at the Harris/6 if you want to see what kind of
>contortions were necessary to provide byte addressability
>with a 24 bit word size.
>
>In summary, I agree that while there was no good technical reason
>to have an eight bit byte originally, anyone designing a new
>computer that does not have an eight bit byte will be doomed
>to market failure. If Univac's 1100 series had taken off better than
>IBM's machines, I would probably be saying that six bit bytes
>are the wave of the future. That is not the case.
>

Actually, it would be nice to have 9 bit bytes. Granted, there are many
times that I wish I had byte addressability, but the PLUS compiler can
generate some (rather clever) code to do it for me. Actually, with PLUS I
can have arrays of arbitrarily long entities, as someone in a previous
message wanted.

Another thing to consider is that on a machine like the 1100 with 36 bit
words, you have more precision available both in the (single/double)
integer and floating point formats. This apparently matters to some folks
around here.

>John Haller

BTW, if anyone is interested in obtaining the TCP/IP package that was
written for the 1100 at the University of Maryland, drop me some mail.

Louis A. Mamakos  WA3YMH    Internet: louie@TRANTOR.UMD.EDU
University of Maryland, Computer Science Center - Systems Programming
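The "squeeze the 9th bit out of each byte" step, and the ones-complement checksum mentioned alongside it, can be sketched as below. This is not the Maryland PLUS code: the function names are hypothetical, the layout of four 9-bit bytes per 36-bit word with the first octet in the most significant byte is an assumption, and the checksum is the usual 16-bit ones-complement sum written out for a two's-complement host.

```python
def spread(value32):
    """Place the four octets of a 32-bit value into four 9-bit bytes.

    Assumed layout: first octet in the most significant 9-bit byte,
    9th (top) bit of each byte left clear.
    """
    word = 0
    for i in range(3, -1, -1):
        word = (word << 9) | ((value32 >> (8 * i)) & 0xFF)
    return word


def squeeze(word36):
    """Drop the 9th bit of each 9-bit byte, yielding a 32-bit value."""
    value = 0
    for i in range(3, -1, -1):
        nine = (word36 >> (9 * i)) & 0x1FF
        value = (value << 8) | (nine & 0xFF)
    return value


def ones_complement_checksum(data):
    """16-bit ones-complement checksum over a sequence of octets."""
    if len(data) % 2:
        data = data + b"\0"          # pad odd-length input with a zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:               # fold carries back in (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


addr = 0xC0A80001
print(hex(squeeze(spread(addr))))
print(hex(ones_complement_checksum(bytes.fromhex("0001f203f4f5f6f7"))))
```

The carry-folding loop is the part a ones-complement machine gets for free, which is presumably the convenience the post refers to.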
dg@wrs.UUCP (David Goodenough) (07/25/87)
In article <6144@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <857@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>A byte is therefore exactly 8 bits. No more and no less. Opinions to
>>the contrary belong in the 1960s. Let them lie there and die there.
>
>And people who believe that 8 bits is sufficient to encode a
>character are either naive or stupid.

Well I've never yet had a problem communicating with any machine that uses
ASCII (American *STANDARD* Code for Information Interchange), and it's my
(possibly deluded :-) belief that there are a lot of machines out there that
do like I do and use 8 bit bytes for holding characters. Let's see - there
are Z80's (and maybe a couple of dozen other 8 bit micros), 8086 family,
ns32000 family, pdp-11, vax, 68000 family, Z8000, amd2900 family, etc. etc.
etc. Then we start looking at uarts and other communication devices - we
have the Z80 DART/SIO, 8080 devices, the 6502 ACIA, plus the countless others
that are not attached to any architecture. I don't know about the rest of the
world, but it looks to me as if 8 bit chars are here to stay.

(Just out of idle curiosity what size did you have in mind for a character,
and WHY?)
-- 
		dg@wrs.UUCP - David Goodenough

					+---+
					| +-+-+
					+-+-+ |
					  +---+
chris@mimsy.UUCP (Chris Torek) (07/26/87)
>In article <6144@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn) writes: >>And people who believe that 8 bits is sufficiently to encode a >>character are either naive or stupid. In article <274@wrs.UUCP> dg@wrs.UUCP (David Goodenough) replies: >Well I've never yet had a problem communicating with any machine that uses >ASCII (American *STANDARD* Code for Information Interchange), ... -------- >(Just out of idle curiosity what size did you have in mind for a character, >and WHY?) No wonder people get the idea that Americans are parochial. Americans *are* parochial! :-) How many languages do you speak---or rather, how many do you *write*? How many can you write while staying with 7-bit ASCII? ISO Latin-1 helps; the `extra' characters allow me to write in Deutsch (if I could) or Francois (look, there is one of those missing letters already) or Espanol (there goes another one), but does not do much for Hebrew (lost a bunch!) or Russian or (more troublesome) Japanese or Chinese. 16 bits seems to work for Japanese Kanji, but is, at least technically, not enough for Chinese (80,000+ symbols!). Moreover, there are people who want all of these simultaneously. I think 32 bits should suffice. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris
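The arithmetic behind these sizes is easy to check. The sketch below computes the smallest code width that can give each of n symbols a distinct value; the function name is hypothetical, and the symbol counts fed to it are the figures quoted in the post, not independently verified totals.

```c
/* Smallest number of bits b such that 2^b >= n_symbols.
   Assumes n_symbols >= 1. */
unsigned bits_needed(unsigned long n_symbols)
{
    unsigned bits = 0;
    unsigned long max_code = n_symbols - 1;  /* largest code value required */

    while (max_code != 0) {
        max_code >>= 1;
        bits++;
    }
    return bits;
}
```

With this, 128 symbols need 7 bits, 256 need 8, 65,536 need 16, and 80,000 need 17, which is why a 16-bit code falls just short of the Chinese figure given above.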
mats@forbrk.UUCP (Mats Wichmann) (07/28/87)
>> On a DEC-10/20, for example, a byte can reasonably be anything from >> 1 (0?) to 36 (35?) bits; 6, 7, and 9 bit bytes are all quite common... >Another example worth mentioning is the BBN C/70 and its kin, which have >10-bit bytes as I recall....There are also machines with 9-bit bytes, > although one seldom sees them in the Unix world. > >And then there's the PDP-8, where you get your choice of 12-bit bytes (ugh) >or 6-bit bytes (ARGH)... Old programmer #1: You think you had it tough? When I were learning to program, all I had were bits. I had to tie them together with string anytime I wanted to do something. Old Programmer #2: Bits? You had bits? You had it easy! All we had were..... ... ... Okay, guys, enough already. Please? -mats
karl@haddock.ISC.COM (Karl Heuer) (07/29/87)
In article <857@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes: >On modern machines, a byte is 8 bits. [Anything else is archaic.] ... >In his book "Reliable Data Structures in C", Thomas Plum gives portable >implementations of the memxxx functions (e.g. memset(), memcpy()). He >does not feel the need to point out that these are portable only if the >machine's word will hold exactly an integral number of chars. He doesn't need that restriction because the C language has already imposed it. But this has nothing to do with 8-bit bytes! On a 36-bit machine, a byte (in the C sense) *cannot* be 8 bits. If Plum's implementation is portable, it will still work on such a machine, with 9- or even 12- or 36-bit bytes. Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
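Heuer's point can be illustrated with a byte-at-a-time copy. This is not Plum's implementation, which is not reproduced in the thread; it is a minimal sketch with a hypothetical name (my_memcpy) showing that nothing in such a routine depends on how many bits a byte holds.

```c
#include <stddef.h>

/* Copy n bytes from src to dst, one unsigned char at a time.
   sizeof counts in units of char, so every object is a whole number
   of chars regardless of whether a char is 8, 9, or 36 bits wide. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dst;
}
```

A call such as my_memcpy(&b, &a, sizeof a) copies a complete object on any conforming implementation, because sizeof a is already expressed in the machine's own bytes.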
larry@kitty.UUCP (Larry Lippman) (07/29/87)
In article <358@forbrk.UUCP>, mats@forbrk.UUCP (Mats Wichmann) writes: > >And then there's the PDP-8, where you get your choice of 12-bit bytes (ugh) > >or 6-bit bytes (ARGH)... > > Old programmer #1: You think you had it tough? When I were learning to > program, all I had were bits. I had to tie them together with string > anytime I wanted to do something. > > Old Programmer #2: Bits? You had bits? You had it easy! All we had were..... operational amplifiers, 10-turn pots, patch cords, and a null meter! Sorry, but I had to finish this. Anyone remember ANALOG computers (especially those "personal" desktop versions made by EIA)? <> Larry Lippman @ Recognition Research Corp., Clarence, New York <> UUCP: {allegra|ames|boulder|decvax|rocksanne|watmath}!sunybcs!kitty!larry <> VOICE: 716/688-1231 {hplabs|ihnp4|mtune|seismo|utzoo}!/ <> FAX: 716/741-9635 {G1,G2,G3 modes} "Have you hugged your cat today?"
gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/30/87)
Re: memcpy() etc. It also is not yet settled whether the mem* functions are to handle (char)s or some other type of object representing bytes, e.g. (short char)s. At the moment the C language does not distinguish between a byte and a char, although it makes no presumption about the size of a char except that it must be at least 8 bits (it can be larger). The multiple-byte character issue has not yet been decided, unless it happened at the Paris X3J11 meeting in June which I had to miss.
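The "at least 8 bits" rule is visible to programs. In the C standard as eventually published, the width of a char is available as CHAR_BIT in <limits.h>; the sketch below, with a hypothetical function name, measures the width by hand and can be compared against that macro.

```c
#include <limits.h>

/* Count the bits in an unsigned char by shifting an all-ones value
   down to zero. unsigned char has no padding bits, so the count is
   the byte width of the machine as C sees it. */
unsigned count_char_bits(void)
{
    unsigned char c = (unsigned char)~0u;  /* every bit set */
    unsigned bits = 0;

    while (c != 0) {
        c >>= 1;
        bits++;
    }
    return bits;
}
```

Code that truly depends on 8-bit bytes can compare the result (or CHAR_BIT directly) with 8 and refuse to proceed otherwise, rather than assuming it silently.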
steve@nuchat.UUCP (Steve Nuchia) (07/30/87)
In article <1879@ihlpe.ATT.COM>, daryl@ihlpe.ATT.COM (Daryl Monge) writes: > However, "bit" is useless in terms of user friendliness. Imagine: > -rwxr-x--- 1 daryl daryl 3102120bits Feb 6 22:40 gmacs > word (:-) "character", since at least to me that has some real world meaning. > ex: > /e31 (/dev/dsk/36bs2): 12632K characters 33572 unique files > > Comments? > Daryl Monge How about: drwxrwxrwx 11 foo baz 10 entries frotz some_dir ---x--x--t 1 bin bin `size myprog` myprog (use your copious imagination) -rw-rw-rw- 2 joe dbase 47 Srecords dbase4 (S = Sagan = 1 billion billions) There are many metrics that have meaning. Which to use for a particular file depends on what you want to do with the file, which I should point out is not constant for a given file, and on a multiuser machine may even be more than one thing at a time. The guy trying to find a place to put the turkey wants to know how much medium it occupies while the owner wants to know how many words are in his file so he'll know when he can turn it in. Hey! it's just a comment... he asked for it... it's his fault! Steve Nuchia
allbery@ncoast.UUCP (Brandon Allbery) (07/31/87)
As quoted from <274@wrs.UUCP> by dg@wrs.UUCP (David Goodenough): +--------------- | >And people who believe that 8 bits is sufficiently to encode a | >character are either naive or stupid. | | Well I've never yet had a problem communicating with any machine that uses | ASCII (American *STANDARD* Code for Information Interchange), and it's my >... | others that are not attached to any architecture. I don't know about the | rest of the world, but it looks to me as if 8 bit chars are here to stay. | (Just out of idle curiosity what size did you have in mind for a character, | and WHY?) +--------------- The key words are in here: *AMERICAN* Standard Code... and "I don't know about the rest of the world...". Kanji (for example) doesn't fit in 8 bits. Is the U.S. of A. the only country allowed to use computers? -- Brandon S. Allbery, moderator of comp.sources.misc and comp.binaries.ibm.pc {{harvard,mit-eddie}!necntc,well!hoptoad,sun!cwruecmp!hal}!ncoast!allbery ARPA: necntc!ncoast!allbery@harvard.harvard.edu Fido: 157/502 MCI: BALLBERY <<ncoast Public Access UNIX: +1 216 781 6201 24hrs. 300/1200/2400 baud>>
mark@ems.MN.ORG (Mark H. Colburn) (07/31/87)
In article <6156@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes: >In article <463@unisoft.UUCP> greywolf@unisoft.UUCP (The Grey Wolf @ ext 165) writes: >>I see nothing wrong [with] eight bits for a character. >I take it you don't pay much attention to the rest of the world, then. Often I have seen a lot of flaming with absolutely no explanation as to why the original poster was wrong. This is one of those cases. Rather than say that an opinion is wrong, it would help to explain why it is wrong, so that the original poster (hopefully) learns from his mistakes. Doug is right of course. There is a need for more than eight bits for representing characters in other languages. The most glaring example is Kanji, where there are literally thousands of characters in the writing system. Obviously, it would be very difficult to express that in 8 bits :-). Other less obvious examples would be German, Norwegian, French and Greek. All of these languages, and others as well, make use of letters that ASCII lacks. For example, an a, o or u with an umlaut in German; vowels with a circumflex (^), accent grave (`), or accent aigu ('), or a c with a cedilla, in French; or the ae ligature in Norwegian. None of these characters is in the standard 7-bit ASCII character set. Many of these are handled by extensions to ASCII or some other character set standard; however, 8 bits is not enough for some of the glyph-oriented writing systems. If you would like more information on this topic, there have been a number of good papers written and given at USENIX, as well as appearing in many of the trade journals. In addition, it is addressed in the proposed POSIX standard. -- Mark H. Colburn DOMAIN: mark@ems.MN.ORG EMS/McGraw-Hill UUCP: ihnp4!meccts!ems!mark AT&T: (612) 829-8200
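ASCII proper defines only the codes 0 through 127, so any accented letter carried in an 8-bit extension shows up as a byte above that range. A minimal sketch of the check, with a hypothetical function name:

```c
#include <stddef.h>

/* Returns 1 if every byte in s[0..n-1] is a 7-bit ASCII code (0..127),
   0 as soon as a byte outside that range is found. */
int is_7bit_ascii(const unsigned char *s, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (s[i] > 127)
            return 0;
    return 1;
}
```

For instance, in ISO Latin-1 the u with an umlaut is the single byte 0xFC, so a buffer containing it fails this test while the plain letters around it pass.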
henry@utzoo.UUCP (Henry Spencer) (08/05/87)
> operational amplifiers, 10-turn pots, patch cords, and a null meter! > > Sorry, but I had to finish this... Ah, but you didn't finish it. You forgot the tube tester! (Yes, Virginia, people did build op-amps out of vacuum tubes...) -- Support sustained spaceflight: fight | Henry Spencer @ U of Toronto Zoology the soi-disant "Planetary Society"! | {allegra,ihnp4,decvax,utai}!utzoo!henry