[comp.lang.forth] Align

wmb@MITCH.ENG.SUN.COM (01/29/91)

> >ALIGN etc
>   Again I have no strong feeling here, so I give my default advice,  although
>weakly: take it out of the standard if possible; at least take  it out of the
>core.

There are machines in the world besides PC's.

Some of us use those machines.

Many of those machines have hardware alignment restrictions.

ANS Forth intends to allow you to write portable programs, not just
PC programs.

If ALIGN is optional, I can't write a portable program, unless all
Forth data access operators are guaranteed to work on any alignment.
The effect of such a guarantee would be to at least double the execution
time of words like @ and ! on many processors.

If your machine doesn't need ALIGN, either define it as an immediate no-op,
(at a memory cost of 12 bytes), or supply it in source form (at a memory cost
of 0 bytes).  The CORE wordset is not the set of words that you must have
present in the dictionary at all times; instead, it is the set of words that
you are expected to either have present, OR to show the user how to implement
on your system by providing source code for that word.
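
The two cases above could be sketched roughly as follows. This is only an
illustration: NOOP-ALIGN, CELL-ROUND and PADDING-ALIGN are hypothetical names
chosen so that the system's own ALIGN is left untouched, and the rounding
assumes the cell size is a power of two.

```forth
\ Hypothetical names, so the system's own ALIGN is left untouched.

\ Byte-addressable machine: ALIGN has nothing to do.
: NOOP-ALIGN ( -- ) ; IMMEDIATE

\ Machine with alignment restrictions: pad the dictionary up to the
\ next cell boundary.  Assumes the cell size is a power of two.
: CELL-ROUND ( addr -- a-addr )  1 CELLS 1- +  1 CELLS NEGATE AND ;
: PADDING-ALIGN ( -- )  HERE CELL-ROUND HERE - ALLOT ;
```

Either definition could be supplied in source form by a system that does not
keep ALIGN resident.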

I have tons of portable source code that works in quite a few different
environments, with different alignment restrictions.  The word ALIGN appears
all over the place.

The "I don't need it in my environment, so you can't have it" attitude
is pretty parochial, in my (not at all humble) opinion.

Mitch Bradley, wmb@Eng.Sun.COM

UNBCIC@BRFAPESP.BITNET (02/01/91)

> There is no need to use ALIGN before each and every @ or ! as you seem to be
> saying.  @ and ! *must* have address aligned addresses, just as C@ and C!
> *must* have character aligned addresses.  The use of ALIGN is simply when
> moving from a character to an address aligned address.  When all is said
> and done, this does not happen that often.

No need for ALIGN before @ or !, that's right. In almost every case, anyway.
BUT when you <BUILDS things (to use FIG-Forth vocabulary, and I think 79S
too), you need to ALIGN them. And, if the word DOES> nothing, the user will
have to use ALIGN before @ and ! too.

> Peter Knaggs,

                              (8-DCS)
Daniel C. Sobral
UNBCIC@BRFAPESP.BITNET

UNBCIC@BRFAPESP.BITNET (02/15/91)

> > ...when you <BUILDS things, you need to align it.  And, if the word
> > DOES> nothing, the user will have to use ALIGN before @ and ! too.

>Actually, that's not true, if the system implementor did things right.  The
>last word-aligned system I used automatically ALIGNed before every CREATE.
>This forced the parameter field to an even address (which was required for the
>thread of a colon definition). So, DOES> always returned an aligned address,
>and the user didn't have to worry about it.

>Strings compiled in-line were always padded to an even number of bytes; this
>required a small bit of additional logic in the run time code which advances
>the IP over the string, but it was invisible to the user.  (In-line byte
>parameters were forbidden, no great loss.)

1) I think the loss of the ability to compile bytes is a great loss.
2) How about
: DATA CREATE ALLOT ( NAME ) , ( AGE ) ;
15 30 DATA NAME_1

Just putting 15 won't work. SPARCs have a 4-byte alignment restriction too, for
example. And on and on. And RECORD structures ARE VERY USEFUL.
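
A minimal sketch of how such a record might be made safe on an aligned
machine, using ALIGN and ALIGNED as ANS Forth specifies them. The stack order
( age name-len ) and the accessor AGE@ are assumptions made for the
illustration; they are not part of the definition above.

```forth
\ ( age name-len "name" -- )  Reserve name-len bytes, then one aligned cell.
: DATA   CREATE  ALLOT ( name )  ALIGN  , ( age ) ;

\ Hypothetical accessor: skip the name, round up as ALIGN did, fetch.
: AGE@ ( name-len addr -- age )  + ALIGNED @ ;

30 15 DATA NAME_1      \ age 30, 15-byte name
15 NAME_1 AGE@ .       \ prints 30
```

The accessor repeats the rounding that ALIGN performed at definition time,
which works because CREATE leaves the start of the data field aligned.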

                              (8-DCS)
Daniel C. Sobral
UNBCIC@BRFAPESP.BITNET

UNBCIC@BRFAPESP.BITNET (06/19/91)

Sorry for the garbage... Anyway,

=> Date: Mon, 17 Jun 91 19:37:47 GMT
=> From: Rob Sciuk
=>  <news-server.csri.toronto.edu!torsqnt!geac!maccs!innovus!rob@UUNET.UU.NET>
=> Subject: RE: Memory Management/PIC
=> Elizabeth points out that any standard defining word should take care
=> to align words, (bodies, headers, and fields contained therein) on
=> appropriate boundaries.  Further, `ALLOT' and `,' should align on
=> CELL boundaries, and `C,' should ensure that the next invocation of
=> `HERE', `ALLOT', `,' etc. will utilize a CELL boundary appropriate
=> to the processor [mine].

C, should ensure that the next invocation of HERE, ALLOT... will utilize a CELL
boundary?!?!?!?!??!?!?!? It's better to live with a slow @ and ! than with this!
We have only two options: 1) Throw an overhead upon HERE, ALLOT...; 2) Make C,
ALLOT a CELL, thus acting as a comma.

Another thing: if ALLOT and HERE always return an aligned address, it's better
to make this very clear in the standard, or Structure Wordsets (which are very
common) will be a source of lots of errors. I wouldn't like an ALLOT that
aligns, but then, you can never satisfy everyone.


                              (8-DCS)
Daniel C. Sobral                           Errare Humanum Est...
UNBCIC@BRFAPESP.BITNET                     ...Perseverare Autem Diabolicum
UNBCIC@FPSP.FAPESP.ANSP.BR
--------------------------------------------------------------------------
No one, but me, is responsible for the above message.

nick@sw.stratus.com (Nicolas Tamburri) (06/19/91)

Daniel C. Sobral writes:
>C, should ensure that the next invocation of HERE, ALLOT... will utilize a CELL
>boundary?!?!?!?!??!?!?!?

Good.  It wasn't just me who thought this was a lousy idea.  I was wondering
how C, would ever accomplish this,  short of always allocating enough
bytes to end up on a CELL boundary.  But then how do you pack bytes with
successive "C,"s (sp?).

I'm always hesitant about posting to this group.  Having read publications
by many of the other posters, it is hard for me to think of myself as a
peer.  For example,  I assume there must be something I don't understand
about all these ALIGNment issues.  Haven't we been living with ALIGN on
68Ks for a decade now?  I've always assumed that the implementation was
pretty straightforward:  ALLOT assures that the address generated for the
variable being allotted is appropriate to the size of the variable, allocating
extra bytes to make it so.  Of course, this assumes the size is a 'natural'
size for the processor, usually bytes, longs etc.  For 'unnatural' records,
you had to align things manually.  Is there something new I'm missing?

BTW:  Alignment to a CELL boundary is not necessarily sufficient, depending
on the processor.  For example, the i860 requires address alignment to be
MOD(size of variable), or there is a very high performance penalty on memory
accesses.  
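
The size-dependent rule described here could be sketched as a rounding word
that takes the alignment as a parameter. ROUND-UP-TO is a hypothetical name,
and the sketch assumes the size is a power of two.

```forth
\ Round addr up to the next multiple of n (n must be a power of two).
: ROUND-UP-TO ( addr n -- addr' )  1- TUCK +  SWAP INVERT AND ;

13 2 ROUND-UP-TO .    \ 14
13 4 ROUND-UP-TO .    \ 16
13 8 ROUND-UP-TO .    \ 16
16 8 ROUND-UP-TO .    \ 16
```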

rob@innovus.uucp (Rob Sciuk) (06/19/91)

In article <9106190432.AA02430@ucbvax.Berkeley.EDU> UNBCIC%BRFAPESP.BITNET@SCFVM.GSFC.NASA.GOV writes:
>Sorry for the garbage... Anyway,
>
>=> Date: Mon, 17 Jun 91 19:37:47 GMT
>=> From: Rob Sciuk
>=>  <news-server.csri.toronto.edu!torsqnt!geac!maccs!innovus!rob@UUNET.UU.NET>
>=> Subject: RE: Memory Management/PIC
>=> Elizabeth points out that any standard defining word should take care
>=> to align words, (bodies, headers, and fields contained therein) on
>=> appropriate boundaries.  Further, `ALLOT' and `,' should align on
>=> CELL boundaries, and `C,' should ensure that the next invocation of
>=> `HERE', `ALLOT', `,' etc. will utilize a CELL boundary appropriate
>=> to the processor [mine].
>
>C, should ensure that the next invocation of HERE, ALLOT... will utilize a CELL
>boundary?!?!?!?!??!?!?!? It's better live with a slow @ and ! than with this!
>We have only two options: 1) Throw an overhead upon HERE, ALLOT...; 2) Make C,
>ALLOT a CELL, thus actings a comma.

I think you misunderstood my point ... C, in my example works entirely as
advertised, and allots in 1-byte increments. HOWEVER, if a C, leaves HERE at an
unaligned address, the HERE pointer is advanced to the next aligned address.
The next invocation of ALLOT, HERE etc. will return the padded and aligned
address, but the next C, will store BEHIND HERE, giving a CHERE :-) if you will.
I feel no need to actually provide CHERE, and entirely hide the semantic detail
in the implementation of C,.

	---> Direction of Growth
	{[*][*][ ][ ]}{[ ][ ][ ][ ]}{[ ][ ][ ][ ]}{[ ][ ][ ][ ]}
		^       ^
		CHERE   HERE

>
>Another thing, if ALLOT and HERE return always an aligned address, it's better
>make this very clear in the standard, or Structure Wordsets (wich are very

Agreed. This should occur!  I implore the standards committee to do so!
(in fact, I may make written application in the near term)

>commom) will be source of lots of errors. I wouldn't like an ALLOT that
>aligns, but, then, you can never satisfy everyone.

In my example, ALLOT doesn't align, `C,' does.

Alignment is not just a good idea on most RISC processors, it is the law!
The cost of alignment on byte oriented machines is, IMHO, negligible.
Further, inclusion of the alignment semantics within the standard would aid 
portability immeasurably.  Granted, this may be unacceptable in some embedded
systems(?), but this can be overcome with words such as CALLOT, and CHERE with 
the obvious implied semantics (remember this is Forth we are dealing with!).

By the way, this is something that I had to retrofit into my implementation
because I was dumping core on Unix systems with what I believed to be standard
(Forth-83) programs.  My implementations will continue to behave this way 
whether the standard includes this or not because I have been burned.
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Sciuk	rob@innovus.com OR rob@innovus.on.ca
Innovus Inc.	204-200 James St S. Hamilton, Ont. 	Phone:	(416) 529-8117 
		{not a flame ... merely a glimmer ...}	Fax:	(416) 572-9586	

UNBCIC@BRFAPESP.BITNET (06/19/91)

=> Date: Wed, 19 Jun 91 14:10:18 GMT
=> From: Nicolas Tamburri
=>  <att!bu.edu!transfer!lectroid!sw.stratus.com!nick@UCBVAX.BERKELEY.EDU>
=> Subject: RE: Align

=> I'm always hesitant of posting to this group,  having read publications
=> by many of the other posters, it is hard for me to think of myself as a
=> peer.  For example,  I assume there must be something I don't understand

NEVER think that way. I think that almost everyone on this group wants to read
messages from everyone on this group.

=> BTW:  Alignment to a CELL boundary is not necessarily sufficient, depending
=> on the processor.  For example, the i860 requires address alignment to be
=> MOD(size of variable), or there is a very high performance penalty on memory
=> accesses.

Yes and no. On the SPARCstation, for example, you also need 4-byte alignment
when reading 32 bits (and 2-byte alignment when reading 16 bits, but the Forths
on SPARCs are usually 32 bits). *BUT*, in Forth you can only read a memory
position with @ and C@. So, you only need characters and cells.


                              (8-DCS)
Daniel C. Sobral                           Errare Humanum Est...
UNBCIC@BRFAPESP.BITNET                     ...Perseverare Autem Diabolicum
UNBCIC@FPSP.FAPESP.ANSP.BR
--------------------------------------------------------------------------
No one, but me, is responsible for the above message.

rob@innovus.uucp (Rob Sciuk) (06/20/91)

In article <6217@lectroid.sw.stratus.com> nick@sw.stratus.com (Nicolas Tamburri) writes:
>Daniel C. Sobral write:
>>C, should ensure that the next invocation of HERE, ALLOT... will utilize a CELL
>>boundary?!?!?!?!??!?!?!?
>
>Good.  It wasn't just me who thought this was a lousy idea.  I was wondering
>how C, would ever accomplish this,  short of always allocating enough
>bytes to end up on a CELL boundary.  But then how do you pack bytes with
>successive "C,"s (sp?).

here's how!

	hmmm ... here we go again ... my C, will not pad the dictionary
	per se, there is an implicit ALIGN and ALLOT done if necessary

	My prior posting included a C implementation of C, but the 
	following code is provided to clarify (?) the semantics of my
	proposed C,. 

	quan CDP
	: C,	( c --- )
		CDP NOT  ( CDP not initialized or ... )
		CDP HERE = OR ( CDP is at HERE or ... )
		HERE CDP - CELLSIZE 1- > OR ( CDP is more than 1 CELL behind HERE )
		IF
			HERE is CDP  	( re-align CDP )
			1 ALLOT  	( push HERE up to the next CELL )
		THEN
		CDP C!			( store the character at CDP, one byte only )
		CDP 1+ is CDP		( increment CDP )
	;

	This code will successively store bytes contiguously, unless an
	ALLOT intervenes.  HERE will ALWAYS point to the next ALIGNED
	address regardless of the number of invocations of C,

	The ONLY time padding will occur, is if C, is followed by ALLOT,
	CREATE, or a : def'n ... otherwise, CDP points to the next available
	byte in the dictionary, and HERE points to the next available CELL
	greater than CDP .... when padding does occur, it will cost at most
	three bytes on a 32 bit machine (1 byte on a 16 bit machine).
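
The two-pointer behaviour described above can be modelled in a private buffer
without touching the real dictionary. This is a simplified sketch, not the
implementation posted here: POOL, NEXT-CELL, NEXT-CHAR, POOL-ALLOT and POOL-C,
are hypothetical names, the allot word resets the character pointer itself
instead of letting C, detect a stale pointer, and there is no bounds checking.

```forth
\ Simplified model in a private buffer; all names are hypothetical.
CREATE POOL  64 ALLOT
VARIABLE NEXT-CELL     \ plays the role of HERE, always cell-aligned
VARIABLE NEXT-CHAR     \ plays the role of CDP ("CHERE")
POOL NEXT-CELL !   POOL NEXT-CHAR !

\ Reserve n bytes, round up to a cell, and abandon any partly used cell.
: POOL-ALLOT ( n -- )
   NEXT-CELL @ + ALIGNED  DUP NEXT-CELL !  NEXT-CHAR ! ;

\ Store one byte; open a fresh cell only when none is partly free.
: POOL-C, ( char -- )
   NEXT-CHAR @ NEXT-CELL @ = IF  1 CELLS NEXT-CELL +!  THEN
   NEXT-CHAR @ C!   1 CHARS NEXT-CHAR +! ;

1 POOL-C,  2 POOL-C,  3 POOL-C,   \ three bytes, stored contiguously
```

In this model, following a POOL-C, with 15 POOL-ALLOT skips whatever is left
of the open cell, which is the padding cost mentioned above.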
>
>I'm always hesitant of posting to this group,  having read publications
>by many of the other posters, it is hard for me to think of myself as a
>peer.  For example,  I assume there must be something I don't understand

	don't be ... your interest makes you a peer!

>about all these ALIGNment issues.  Haven't we been living with ALIGN on
>68Ks for a decade now?  I've always assumed that the implementation was
>pretty straight forward:  ALLOT assures that the address generated for the
>variable being alloted is appropriate to the size of the variable, allocating

	The problem lies not with processors like the MC68000 family, 
	which allow byte alignment (as does Intel), but with the MC88000 and
	other RISC processors which REQUIRE word alignment (HP-PA, SPARC etc).
	This means unaligned code will dump CORE in a Unix process running
	on such a RISC machine.

rss.
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Rob Sciuk	rob@innovus.com OR rob@innovus.on.ca
Innovus Inc.	204-200 James St S. Hamilton, Ont. 	Phone:	(416) 529-8117 
		{not a flame ... merely a glimmer ...}	Fax:	(416) 572-9586	

UNBCIC@BRFAPESP.BITNET (06/20/91)

=> I think you misunderstood my point ... C, in my example works entirely as
=> advertised, and allots in 1 byte increments HOWEVER, if a C, leaves HERE at an
=> unaligned address, the HERE pointer is advanced to the next aligned address.
=> The next invocation of ALLOT, HERE etc, will return the padded and aligned
=> address, but the next C, will assign BEHIND HERE giving a CHERE :-)if you will.
=> I feel no need to actually provide CHERE, and entirely hide the semantic detail
=> in the implementation of C,.
=>
=>      ---> Direction of Growth
=>      {[*][*][ ][ ]}{[ ][ ][ ][ ]}{[ ][ ][ ][ ]}{[ ][ ][ ][ ]}
=>              ^       ^
=>              CHERE   HERE

No, I didn't misunderstand your point. That's EXACTLY what I dislike. As your
implementation has two pointers instead of one, the overhead is not on ALLOT
(etc), but on C,. It doesn't make things better, anyway.

=> >commom) will be source of lots of errors. I wouldn't like an ALLOT that
=> >aligns, but, then, you can never satisfy everyone.
=>
=> In my example, ALLOT doesn't align, `C,' does.

You said that ALLOT always returns an aligned address. What about CREATE 15
ALLOT ( name ) 0 , ( age ) ?

=> Alignment is not just a good idea on most RISC processors, it is the law!

For sure.

=> The cost of alignment on byte oriented machines is, IMHO, negligible.
=> Further, inclusion of the alignment semantics within the standard would aid
=> portability immeasurably.  Granted, this may be unacceptable in some embedded
=> systems(?), but this can be overcome with words such as CALLOT, and CHERE with
=> the obvious implied semantics (remember this is Forth we are dealing with!).

Leave things working without automatic alignment and it'll be just fine.
Anyway, if the embedded system requires as much space as possible, the
ALIGNMENT will probably be a byte.

DISCLAIMER: It doesn't matter if the current dpANS Forth has this. Although I
don't like it, it's just not enough to make me dislike ANS Forth.


                              (8-DCS)
Daniel C. Sobral                           Errare Humanum Est...
UNBCIC@BRFAPESP.BITNET                     ...Perseverare Autem Diabolicum
UNBCIC@FPSP.FAPESP.ANSP.BR
--------------------------------------------------------------------------
No one, but me, is responsible for the above message.

UNBCIC@BRFAPESP.BITNET (06/21/91)

=>      My prior posting included a C implementation of C, but the
=>      following code is provided to clarify (?) the semantics of my
=>      proposed C,.

I'm not even considering the speed, just how much complexity you are throwing
at a simple word!

=>
=>      quan CDP
=>      : C,    ( c --- )
=>              CDP NOT  ( CDP not initialized or ... )
=>              CDP HERE = OR ( CDP is at HERE or ...)
=>              DP HERE - CELLSIZE 1- > OR IF ( CDP is more than 1
=>                                              CELL behind HERE )
=>                      HERE is CDP     ( re-align CDP )
=>                      1 ALLOT         ( push HERE up to the next CELL )
=>              THEN
=>              CDP !                   ( store the character at CDP )
=>              CDP 1+ is CDP           ( increment CDP )
=>      ;

                              (8-DCS)
Daniel C. Sobral                           Errare Humanum Est...
UNBCIC@BRFAPESP.BITNET                     ...Perseverare Autem Diabolicum
UNBCIC@FPSP.FAPESP.ANSP.BR
--------------------------------------------------------------------------
No one, but me, is responsible for the above message.

nick@sw.stratus.com (Nicolas Tamburri) (06/22/91)

> rob@innovus.uucp (Rob Sciuk) writes:
>	The problem lies not with processors like the MC68000 family, 
>	which allow byte aligment (as do Intel) but the MC88000 and other 
>	RISC processors which REQUIRE word alignment (HP-PA, SPARC etc).

As far as I know, the 68000/6801x processors will fault if you try to execute
a word-mode instruction with an odd (byte) address.  However, this "feature"
was corrected in the 68020 and later processors, although with a penalty in
access time (which is probably what you were thinking of).  I think my
statement stands: necessary alignment is nothing new.

In any case:  I guess I see how your method works,  but I have a philosophical
disagreement with it.  I don't mind CREATE aligning things behind my back,
if it is clearly documented that way.  I do mind ALLOT doing it however, since
this is a more general purpose word.  With your method, how do I create a
structure that looks like this:

[1 byte ( length )][15 bytes ( name )]

0 C, 15 ALLOT

without allotting an extra byte before the name?  What's the benefit of using C,
instead of , in this case?

I guess I would prefer to put the implicit alignment inside words like "," "W,"
"L," etc, where it is obvious that if I use these to define a data structure, then
I will know enough to use the correct mode to access the data.  If I use ALLOT
however, I am implying that I want to access the data in unspecified, possibly
multiple ways.  If I do want the 15 bytes aligned,  I will align them manually,
which ensures my awareness that the bytes are not packed tightly.

In any case, the semantics are a matter of opinion.  We all do things our way.
My only problem with your method is if it restricts my ability to pack bytes.

							/nt

nick@kyron.sw.stratus.com

dcp@world.std.com (David C. Petty) (06/26/91)

In article <9106190432.AA02430@ucbvax.Berkeley.EDU>,
UNBCIC%BRFAPESP.BITNET@SCFVM.GSFC.NASA.GOV writes:

'=>Further, `ALLOT' and `,' should align on
`=> CELL boundaries, and `C,' should ensure that the next invocation of
`=> `HERE', `ALLOT', `,' etc. will utilize a CELL boundary appropriate
`=> to the processor [mine].
`
`C, should ensure that the next invocation of HERE, ALLOT...will utilize a CELL
`boundary?!?!?!?!??!?!?!? It's better live with a slow @ and ! than with this!
`We have only two options: 1) Throw an overhead upon HERE, ALLOT...; 2) Make C,
`ALLOT a CELL, thus actings a comma.

There is a third option.  Put the onus on the _programmer_ to put
ALIGN after the appropriate ALLOT / C, when allocating a data
structure in the dictionary if it is possible that a partial cell has
been allotted.  Then use CELL+ and CHAR+ and ALIGNED when ``stepping
through'' the data structure.  That is the approach taken by ANS
Forth.  

I should also add that most of the alignment calculations can be done
at compile time as in John Hayes' structure implementation (_Forth
Dimensions_ vXI #6).  
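
A minimal sketch of the approach described above, in ANS Forth. REC and
VALUE-OFFSET are hypothetical names, and the compile-time offset at the end
only illustrates the idea; it is not the structure implementation cited.

```forth
\ Build: one char, pad, one cell, one char, pad.
CREATE REC   CHAR A C,  ALIGN   1234 ,   CHAR B C,  ALIGN

\ Step through with the same arithmetic the builder used.
REC C@ EMIT                          \ A
REC CHAR+ ALIGNED @ .                \ 1234
REC CHAR+ ALIGNED CELL+ C@ EMIT      \ B

\ The offset can also be computed once, when the source is compiled.
\ Valid because CREATE leaves REC itself aligned.
0 CHAR+ ALIGNED CONSTANT VALUE-OFFSET   \ hypothetical name
REC VALUE-OFFSET + @ .               \ 1234
```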

-- 
 David C. Petty | dcp@world.std.com | ...!{uunet,bu.edu}!world!dcp /\
      POBox Two | CIS: 73607,1646   | BIX, MCIMail: dcp           /  \
 Cambridge,  MA | `Whatsoever thou doest to the tip,             /    \
02140-0001  USA |  doest thou likewise to the ring.' - RAG      /______\

UNBCIC@BRFAPESP.BITNET (06/28/91)

=> In article <9106190432.AA02430@ucbvax.Berkeley.EDU>,
=> UNBCIC%BRFAPESP.BITNET@SCFVM.GSFC.NASA.GOV writes:
=>
=> '=>Further, `ALLOT' and `,' should align on
=> `=> CELL boundaries, and `C,' should ensure that the next invocation of
=> `=> `HERE', `ALLOT', `,' etc. will utilize a CELL boundary appropriate
=> `=> to the processor [mine].
=> `
=> `C, should ensure that the next invocation of HERE, ALLOT...will utilize a CELL
=> `boundary?!?!?!?!??!?!?!? It's better live with a slow @ and ! than with this!
=> `We have only two options: 1) Throw an overhead upon HERE, ALLOT...; 2) Make C,
=> `ALLOT a CELL, thus actings a comma.

I was talking about options when you make C, ensure ....

=> There is a third option.  Put the onus on the _programmer_ to put
=> ALIGN after the appropriate ALLOT / C, when allocating a data
=> structure in the dictionary if it is possible that a partial cell has
=> been allotted.  Then use CELL+ and CHAR+ and ALIGNED when ``stepping
=> through'' the data structure.  That is the approach taken by ANS
=> Forth.

That's the way I like it. If someone wants an 83-Standard @ in an ANS Forth,
just type:
: @ DUP C@ SWAP 1+ C@ ( shift or multiply ) + ;
: ! 2DUP C! SWAP ( shift or divide ) SWAP 1+ C! ;

About C! with a TRUE flag: I think that ANS Forth has adopted all bits set as
TRUE, so C! should work.
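
One way the two definitions above might be filled in, as a sketch under
stated assumptions: 16-bit values only, 8-bit characters, and the low byte
stored at the lower address. BYTEWISE-@, BYTEWISE-! and BUF are hypothetical
names, used so that the system's own @ and ! are not redefined.

```forth
\ Hypothetical names; 16-bit values only, low byte at the lower address.
: BYTEWISE-@ ( addr -- u )
   DUP C@  SWAP CHAR+ C@  8 LSHIFT  OR ;
: BYTEWISE-! ( u addr -- )
   2DUP C!                       \ C! keeps only the low-order byte
   SWAP 8 RSHIFT  SWAP CHAR+ C! ;

CREATE BUF  4 ALLOT
HEX  1234 BUF CHAR+ BYTEWISE-!   \ BUF CHAR+ is deliberately not cell-aligned
BUF CHAR+ BYTEWISE-@ .           \ 1234
DECIMAL
```

Because every access goes through C@ and C!, the address may have any
alignment, at the cost of the extra work per fetch and store.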

                              (8-DCS)
Daniel C. Sobral                           Errare Humanum Est...
UNBCIC@BRFAPESP.BITNET                     ...Perseverare Autem Diabolicum
UNBCIC@FPSP.FAPESP.ANSP.BR
--------------------------------------------------------------------------
No one, but me, is responsible for the above message.