[net.arch] byte alignment

pdbain@wateng.UUCP (Peter Bain) (04/30/84)

The IBM STRETCH (7030), ca. 1957, was bit addressable. The instruction
contained a word address (64-bit word), a bit offset, and a "byte length",
or operand size. The machine was a masterpiece of baroque architecture:
all sorts of nifty things that were practical to use only in assembler.
The term "byte" was coined for the STRETCH, and originally meant a
variable-size operand.
		-peter

rpw3@fortune.UUCP (05/04/84)

#R:wateng:-96500:fortune:16500013:000:2796
fortune!rpw3    May  4 01:35:00 1984

Speaking of "Stretch" and "bytes", don't forget the venerable PDP-10 and
DECsystem-20 have the same sort of "bytes" -- 1 to 36 bits wide, specified in
the "byte pointer". (I suspect the idea was copied from the IBM predecessors,
since a PDP-10 looks a lot like a cleaned-up IBM 7094.)

The nice thing about PDP-10/20 byte pointers is that they can directly
implement character set mapping via "byte strips". If you partition
the ASCII set into (say) 3 sets (for example, letters/numbers, delimiters,
and "ignore"), you can build a byte pointer that points to a two-bit wide
strip in a 128-word array (or 129 or 256 or 257, depending on taste).
This is a generalization of the trick used in the 'stdio' array "ctype",
but "ctype" uses only single-bit strips. (See CACM, long ago, Jim Gimpell (sp?)
of Bell Labs, "Minimization of Spatially-Multiplexed Character Sets", which
talked about the bit-strips [NOT byte-strips!] used in the SNOBOL-IV lexical
analyzer.)

To make it clearer (?), suppose the character array is called CTYPE (why not?),
and the set we are interested in is called LETDEL (6-char limit, on the -10),
and it occupies bits 31 and 32 (the -10 is a "big Endian" machine, so
bit 0 is MSB and bit 35 is LSB), and suppose we always have the character
to be tested in accumulator (register) "C" (programmer-defined symbol),
then the code to do a three-way branch based on which subset the character
belongs to is the following two instructions (I have used UNIX-like syntax
for non-"10" readers):

+--
| 	ldb	t1,letdel	; get two-bit field into register t1
| 	jrst	@.+1(t1)	; dispatch via jump-table
| 	.word	is_ign		;   C is in the "ignore" set, go to is_ign
| 	.word	is_let		;   C is in the "letters & numbers" set
| 	.word	is_del		;   C is in the "delimiters" set
| 	.word	panic		;   can never happen if we built "ctype" right!
| 
|letdel: point	2,ctype(C),32	; same as ((35-32)<<30) + (2<<24) + (C<<18) + ctype
+--
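
For readers without a -10 handy, here is a rough C sketch of what the
byte pointer holds and what "ldb" extracts. It assumes the usual PDP-10
byte pointer layout (P in bits 0-5, S in bits 6-11, X in bits 14-17,
Y in bits 18-35) and that the position argument of "point" names the
byte's rightmost bit, so that P is 35 minus that position. The names
word36, make_byte_pointer and load_byte are made up for illustration,
and the effective-address calculation (indexing by C) is left out.

```c
#include <stdint.h>

/* A 36-bit PDP-10 word kept in the low 36 bits of a 64-bit integer. */
typedef uint64_t word36;

#define WORD36_MASK ((UINT64_C(1) << 36) - 1)

/*
 * Pack a byte pointer from its fields (hypothetical helper).
 *   size      - byte width in bits (S field)
 *   index_reg - index register number (X field)
 *   address   - 18-bit address (Y field)
 *   pos       - DEC-numbered position (0..35) of the byte's rightmost bit;
 *               the P field stores the bit count to the right of the
 *               byte, i.e. 35 - pos.
 * The indirect bit is left clear.
 */
static word36 make_byte_pointer(unsigned size, unsigned index_reg,
                                unsigned address, unsigned pos)
{
    word36 p = 35u - pos;
    return ((p & 077) << 30) |
           ((word36)(size & 077) << 24) |
           ((word36)(index_reg & 017) << 18) |
           (word36)(address & 0777777u);
}

/*
 * Emulate the field extraction that LDB performs, given the byte pointer
 * and the memory word it has already selected.
 */
static word36 load_byte(word36 bp, word36 memory_word)
{
    unsigned p = (unsigned)(bp >> 30) & 077;   /* bits right of the byte */
    unsigned s = (unsigned)(bp >> 24) & 077;   /* byte size in bits */
    word36 mask = (s >= 36) ? WORD36_MASK : ((UINT64_C(1) << s) - 1);
    return ((memory_word & WORD36_MASK) >> p) & mask;
}
```

With size 2 and position 32, the pointer selects bits 31 and 32 of the
word, which is the LETDEL strip described above.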

Using such byte strips (which are only marginally less efficient even on
hardware without variable-sized bytes), VERY FAST lexical analyzers can be
built, fast enough that character-at-a-time interpretation of interactive
languages (such as FOCAL or *shudder* BASIC) is not unreasonable. In a port
of FOCAL to the PDP-10 which used this, the parsing was NOT the critical
time item (symbol table search was).
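
On hardware without variable-sized bytes, the same byte strip can be
approximated with a shift and a mask. The following is only a sketch
under stated assumptions: ASCII input, a 128-entry table, a two-bit
strip placed arbitrarily at bit 3 of each entry, and an illustrative
delimiter set. None of the names come from the FOCAL port mentioned
above; the switch stands in for the jump table.

```c
#include <stdint.h>

/* The three subsets; the values match the order of the jump table. */
enum char_class { CLASS_IGNORE = 0, CLASS_LETTER = 1, CLASS_DELIM = 2 };

#define LETDEL_SHIFT 3      /* assumed position of the two-bit strip */
#define LETDEL_MASK  0x3u   /* two bits wide */

/* One entry per ASCII character; other strips could share the same word. */
static uint32_t ctype_table[128];

/* Store a class into the LETDEL strip without touching other bits. */
static void set_class(int c, unsigned cls)
{
    ctype_table[c] &= ~(LETDEL_MASK << LETDEL_SHIFT);
    ctype_table[c] |= (uint32_t)cls << LETDEL_SHIFT;
}

/* Fill the strip: letters and digits, delimiters, everything else ignored. */
static void init_ctype_table(void)
{
    const char *delims = " \t\n,;()+-*/=";   /* illustrative choice only */
    int c;

    for (c = 0; c < 128; c++) {
        int alnum = (c >= '0' && c <= '9') ||
                    (c >= 'A' && c <= 'Z') ||
                    (c >= 'a' && c <= 'z');   /* assumes ASCII */
        set_class(c, alnum ? CLASS_LETTER : CLASS_IGNORE);
    }
    for (; *delims != '\0'; delims++)
        set_class((unsigned char)*delims, CLASS_DELIM);
}

/* The shift-and-mask equivalent of the single LDB instruction. */
static enum char_class classify(unsigned char c)
{
    return (enum char_class)((ctype_table[c & 0x7f] >> LETDEL_SHIFT) & LETDEL_MASK);
}

/* Count runs of letters/digits, dispatching on the class of each character. */
static int count_tokens(const char *s)
{
    int tokens = 0, in_token = 0;

    for (; *s != '\0'; s++) {
        switch (classify((unsigned char)*s)) {
        case CLASS_LETTER:
            if (!in_token) { tokens++; in_token = 1; }
            break;
        case CLASS_DELIM:
            in_token = 0;
            break;
        case CLASS_IGNORE:
        default:
            break;              /* ignored characters change nothing */
        }
    }
    return tokens;
}
```

Call init_ctype_table() once before classify() or count_tokens(). The
per-character cost is one table load, one shift, one mask and one
dispatch, which is the "marginally less efficient" version of the
two-instruction sequence.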

This is a case in which efficient HARDWARE architecture on an early machine
can suggest efficient SOFTWARE on later machines, even though they may not
have the special hardware of the earlier machine. One example is the "bit-blt"
hardware of the Xerox "Alto", echoed in the software of the Apple "Mac".

Rob Warnock

UUCP:	{ihnp4,ucbvax!amd70,hpda,harpo,sri-unix,allegra}!fortune!rpw3
DDD:	(415)595-8444
USPS:	Fortune Systems Corp, 101 Twin Dolphin Drive, Redwood City, CA 94065

gwyn@brl-vgr.ARPA (Doug Gwyn ) (05/12/84)

Speaking of the etymology of the word "byte", this is a plea for
the correct spelling of "nybble".  thanks...

andrew@hwcs.UUCP (05/15/84)

	>The IBM STRETCH (7030) ca 1957, was bit addressable.
	>The instruction contained a word address (64 bit word) ,
	>bit offset, and a "byte length", or operand size.
	>The machine was a masterpiece of baroque architecture.

Does my memory deceive me, or was this beastie so parallel and so unreliable
as a result that it was renamed "TWANG"?

		- andrew stewart (...!ukc!edcaad!hwcs!andrew)

graham@convex.UUCP (05/15/84)

#R:wateng:-96500:convex:32800006:000:220
convex!graham    May 15 13:27:00 1984

The bit addressable fixed point on STRETCH was so slow that it was customary for
programmers to do address calculations in floating point!!

Marv Graham; ConVex Computer Corp. {allegra,ihnp4,uiucdcs,ctvax}!convex!graham

klr@lems.UUCP (Ken Robbins) (10/17/84)

I am new to the net and I seem to have got the tail end of these discussions
on byte alignment.  Those of you interested in arbitrary byte alignment
as well as variable length data (as opposed to just byte, word, long
etc.) should probably take a look at Intel's iAPX series of microprocessors.

It's been over a year since I have looked at the architecture of this chip,
and I don't seem to have the data sheets handy, so take this with a grain of
salt.  The iAPX grabs its data in nibbles, and is designed to gobble up strings
(of arbitrary length) of nibbles to form data words that are multiples of
nibbles.  Therefore, this chip (actually chip set) is not constrained by
byte alignment or data width.
  There are quite a few progressive architectural features in the iAPX series;
I will try to get hold of the data sheets and summarize some of the salient
features. If anyone else on the net knows about, or especially has used, this
series, it would be great if they could post their experiences and knowledge.
      Ken

alan@drivax.UUCP (Alan Fargusson) (10/29/84)

You must be thinking of the 432. The iAPX286 is a 16 bit machine, and
if you access a word that is not aligned it does two byte accesses. This
causes a performance penalty that seems rather high, judging by Intel's own
benchmarks. I think you will find that the other iAPX*86 products are much
the same in this respect.

While I am on the subject, it seems to me that the real problem is in 
the implementation. If you design hardware to handle byte alignment it
should be just as fast to access misaligned words as aligned words. This
would be more expensive of course, but if you need to do it in software it
is much slower.

An aside: I wrote a device driver for the Intel iSBC 215 (for UN*X sVr2).
This device uses RAM for control blocks. Two of the fields in the error
status block are misaligned, and one in the format block is misaligned.
Our compiler aligns words, so I had to declare these fields as chars, then
make an int pointer to the first char to access the words. It is kind of a
pain, and the code is less clear than I would like. If hardware designers
don't want to make efficient access to misaligned words then they shouldn't
misalign words in hardware! (is this a flame?).
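
A minimal sketch of the char-field workaround, for anyone facing the
same kind of control block. The struct below is hypothetical and is NOT
the real iSBC 215 layout; it only shows a 16-bit field sitting at an odd
offset. Instead of the int-pointer cast described above, it assembles
the word a byte at a time, which also works on machines that trap on
misaligned access. Little-endian byte order is assumed.

```c
#include <stdint.h>

/*
 * Hypothetical control block, for illustration only.
 * Every member is a byte, so the compiler inserts no padding and the
 * 16-bit field stays at its odd offset.
 */
struct status_block {
    uint8_t flags;           /* offset 0 */
    uint8_t error_code[2];   /* offset 1: 16-bit field, misaligned */
    uint8_t unit;            /* offset 3 */
};

/* Read a 16-bit little-endian value from any address, aligned or not. */
static uint16_t get_u16_le(const uint8_t *p)
{
    return (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
}

/* Write a 16-bit little-endian value to any address, aligned or not. */
static void put_u16_le(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}
```

A driver would then call get_u16_le(sb->error_code) rather than
dereferencing a cast pointer. It costs two byte accesses per word, the
same as the hardware does for a misaligned word, but it keeps the
compiler's alignment rules out of the picture.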
-- 
---------------------
Alan Fargusson.

{ ihnp4, sftig, amdahl, ucscc, ucbvax!unisoft }!drivax!alan