[comp.sys.m68k] M68020 and 4-byte alignment

achar@atari.UUCP (Alan Char) (02/23/89)

I am interested in experiences people have had converting a UNIX system
(operating system and utilities/applications) from 2-byte aligned to
4-byte aligned, specifically, when upgrading from a 68000 environment to
a 68020 environment.  Aside from the compiler itself, what kind of source
code changes (if any) were required?  What kind of performance improvement
(if any) resulted?  How long did it take?  What kind of gotchas did you run
into?  I am involved in a project that needs to estimate the costs/benefits
tradeoffs of doing this, and would appreciate any input on the subject I
can get.

I will summarize responses if people are interested after a couple of
weeks.  Thanks.

[I have cross-posted this to comp.sys.m68k, but have limited followups
to comp.unix.wizards.]
-- 
Alan Char				"Feel free to think." --Dave Clough
ucbvax!sun!atari!achar

achar@atari.UUCP (Alan Char) (03/14/89)

I sent a request about adjusting 68020 kernels to be 4-byte aligned.
Aside from a few people who incorrectly assumed I thought I NEEDED
to do this, the comments seemed to indicate that not much work was
required.  The original message and the responses I received are below.
Thanks to everyone that responded!  --Alan

-----

From achar Fri Feb 24 15:18:38 PST 1989
Article 15532 of comp.unix.wizards:
From: achar@atari.UUCP (Alan Char)
Newsgroups: comp.unix.wizards,comp.sys.m68k
Subject: M68020 and 4-byte alignment
Message-ID: <1361@atari.UUCP>
Date: 22 Feb 89 18:59:45 GMT
Organization: Atari (US) Corporation, Sunnyvale, California

I am interested in experiences people have had converting a UNIX system
(operating system and utilities/applications) from 2-byte aligned to
4-byte aligned, specifically, when upgrading from a 68000 environment to
a 68020 environment.  Aside from the compiler itself, what kind of source
code changes (if any) were required?  What kind of performance improvement
(if any) resulted?  How long did it take?  What kind of gotchas did you run
into?  I am involved in a project that needs to estimate the costs/benefits
tradeoffs of doing this, and would appreciate any input on the subject I
can get.


Date: Thu, 23 Feb 89 07:34:18 -0800
From: <sun!nsc.com!taux01!amos>
Organization: National Semiconductor (IC) Ltd, Israel Home of the 32532

I did part of the work porting Unix from the VAX to CCI's 6/32 machine
(a.k.a. Harris 7000 and Sperry something), which requires 4-byte alignment.

Most of the problems were in programs that did things like

	*(int *)p = i;

where p was a non-aligned char pointer.  BSD's assembler and linker were
among the most notorious offenders.  I also had to grep everything for <<8
and <<16.  Looking into each and every warning from lint was a great help
too.
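
The cast above issues a full int store at whatever address p holds, which
traps or runs slowly when p is not int-aligned.  A minimal sketch of one
portable alternative, copying the bytes with memcpy instead; the helper
names store_int_unaligned and load_int_unaligned are hypothetical and do
not come from any of the programs mentioned:

```c
#include <string.h>

/* Hypothetical helper: store an int at an address that may not be
 * int-aligned.  memcpy moves the bytes without ever issuing a
 * misaligned int access through a cast pointer. */
static void store_int_unaligned(char *p, int i)
{
    memcpy(p, &i, sizeof i);
}

/* Hypothetical helper: the matching load. */
static int load_int_unaligned(const char *p)
{
    int i;

    memcpy(&i, p, sizeof i);
    return i;
}
```

Code found by the grep can then call store_int_unaligned(p, i) where it
used to write through the cast pointer.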
-- 
	Amos Shapir				amos@nsc.com


Date: Fri, 24 Feb 89 23:22:20 EST
From: ames!ADAM.PIKA.MIT.EDU!ll-xn!scs (Steve Summit)

If you'd cross-posted to comp.lang.c, they'd have told you that, for
properly written code, no source-level changes (to utilities, mind you)
should be required, once you have the compiler working.  Improperly written
code, on the other hand, which is less infrequent than it ought to be,
is another story...
                                            Steve Summit


From: sun!uunet!microsoft!w-colinp
Date: Fri Feb 24 23:08:50 1989

Oh!  Okay.  I thought you were coming from farther out.  Generally, a little
rearrangement of structs is all you need to do.  Because you don't need to
do it all at once, or do it at all, the risks are pretty low.

Basically, an unaligned access doubles the memory-access time.  If it happens
inside an inner loop, it's worth working over.  If not, it might not be.

It really depends on the application, but you can probably just make a pass
through the .h files and catch 90% of the work.
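
As a rough illustration of the kind of struct rearrangement meant here,
consider a hypothetical header (the struct and member names are invented
for this sketch).  Under a 2-byte alignment rule the long in hdr_before
sits at offset 2, so it straddles two longwords whenever the struct itself
starts on a 4-byte boundary; a 4-byte aligning compiler would instead pad
it.  Putting the widest member first avoids both:

```c
#include <stddef.h>

/* Hypothetical layout written without regard to alignment. */
struct hdr_before {
    short id;
    long  count;    /* offset 2 under 2-byte rules, padded otherwise */
    short flags;
};

/* Same members, widest first: count is at offset 0 under any rules. */
struct hdr_after {
    long  count;
    short id;
    short flags;
};

/* Padding bytes the current compiler added to each layout. */
static size_t hdr_padding_before(void)
{
    return sizeof(struct hdr_before) - (sizeof(long) + 2 * sizeof(short));
}

static size_t hdr_padding_after(void)
{
    return sizeof(struct hdr_after) - (sizeof(long) + 2 * sizeof(short));
}
```

The exact sizes depend on the compiler and target, so the sketch compares
the two layouts rather than assuming particular numbers.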
-- 
	-Colin


Date: 27 Feb 89 09:17:25 CST (Mon)
From: sun!rpp386.dallas.tx.us!jfh (John F. Haugh II)
Organization: River Parishes Programming, Dallas TX

The operating system needs a number of changes not related
to the alignment.  The stack is the biggest issue.  Do keep
in mind that the alignment requirements for a 68020 are less
strict than for a 68000.  The 68020 can fetch operands on
any boundary whereas the 68000 must have word sized and
larger operands aligned on word boundaries.

The compiler should need very little work, aside from
adding the new operators.  I would suggest adding code,
for efficiency's sake, to double-word align double-word
operands and longer, but it is only an efficiency issue.

I have ported PCC to the 68020 from the 68000.  It was
very easy, the compiler is very well written.  Well, as
far as porting goes it is very well written ...

- John.


Date: Fri, 3 Mar 89 15:39:17 pst
From: zardoz!felix!preston (Preston Bannister)
Organization: FileNet Corp., Costa Mesa, CA

In article <1361@atari.UUCP> you write:
>I am interested in experiences people have had converting a UNIX system
>(operating system and utilities/applications) from 2-byte aligned to
>4-byte aligned, specifically, when upgrading from a 68000 environment to
>a 68020 environment.  Aside from the compiler itself, what kind of source
>code changes (if any) were required?  What kind of performance improvement
>(if any) resulted?  How long did it take?  What kind of gotchas did you run
>into?  I am involved in a project that needs to estimate the costs/benefits
>tradeoffs of doing this, and would appreciate any input on the subject I
>can get.

We went through something like this about a year ago going from our
old 68010 box to our new 68020 box.  From the application point of
view, all we had to do was recompile (for the new code format).  (The
new code format has a larger page size, which with our MMU allowed
us to address more physical memory).  We have on the order of a
million lines of C code.
-- 
Preston L. Bannister


Date: 3 Mar 89 11:33:27 MST (Fri)
From: sun!sunburn!flakey.phx.mcd.mot.com!tom (Tom Armistead)
Organization: Motorola Microcomputer Division, Tempe, Az.

    I spent the last year or so, among other things, migrating to
an 88k environment (which is also 4-byte aligned).  For what it's worth,
here's a summary of what I found.

    Of course, as you note, you need a compiler that generates 4 byte
alignment.  Merely recompiling the entire system takes care of
almost everything, but not quite.

    Most of the tedious work involved in converting to the 4 byte 
alignment is in the device drivers.  The devices in the system, of 
course, generally can't be simply "recompiled" and so you're stuck with
forcing the appropriate parts of the device driver structures to
effectively be aligned on 2-byte boundaries.   Basically to do this,
you have to examine all the drivers you are going to migrate and
look at all of the structures that the device "knows" about and
if any long or int element is at an offset that is an odd multiple of
2 bytes (2, 6, 10, ...), split it into 2 shorts.  The driver code that
uses these structures must
then be modified to access the split up longs as two 16 bit values.

    Finding the structures that need changing would be easier
if you have a compiler that can be made to generate informational
messages about structure elements that don't fall on "naturally"
aligned boundaries*.  My compiler didn't have this capability
but fortunately I was already familiar with the drivers so I
was able to locate most of the structures needing tweaking
without much trouble.  

* This would also be useful for finding "software only" structures
  that waste memory due to a poor layout.
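
    Lacking such a compiler option, the check can be approximated by hand.
The following is only a sketch under assumptions: the field table is typed
in from the device documentation, and the names struct field,
count_misaligned and ctl_block are hypothetical.

```c
#include <stddef.h>

/* One member of a device-visible structure, as the hardware sees it. */
struct field {
    const char *name;
    size_t      offset;   /* byte offset from the start of the structure */
    size_t      size;     /* size in bytes: 1, 2 or 4 */
};

/* Count the fields whose offset is not a multiple of their own size,
 * i.e. the ones a 4-byte aligning compiler would move by padding. */
static size_t count_misaligned(const struct field *f, size_t n)
{
    size_t i, bad = 0;

    for (i = 0; i < n; i++)
        if (f[i].size > 0 && f[i].offset % f[i].size != 0)
            bad++;
    return bad;
}

/* Hypothetical controller block: the 4-byte addr sits at offset 2. */
static const struct field ctl_block[] = {
    { "status", 0, 2 },
    { "addr",   2, 4 },
    { "count",  6, 2 },
};
```

Here only addr would be flagged, so it is the one member that needs to be
split into two shorts.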

    Perhaps, if you don't have an existing base of devices you
have to support, you can require the hardware types to build
devices such that the device driver structures used to access
the device are naturally 4-byte aligned.  If you can do this, then you
won't have to futz around with making alignment type changes.

    Now, you need to ask yourself the question whether you have
to support old 2-byte aligned applications and filesystems as well.
If you do, you will have a lot more work to do.  All the "old"
applications will expect any structures passed from the kernel
to be 2-byte aligned and you will have to make allowance for 
those.  This would be a non-trivial task - requiring all
applications to be recompiled with the 4-byte alignment compiler
will save you a lot of trouble. 

    A macro or two can help with the areas of code that have to
be changed.  For example, an "SLONG" type could be declared as

typedef struct slong		/* split longs up into 16 bit halves */
{
	unsigned short hi;
	unsigned short lo;
} SLONG;

and used in the declaration of "int" or "long" entries that have to be
split up.  Likewise, macros can be written to set an SLONG to 
a 32 bit value or read a 32 bit value from an SLONG.  This helps
but the changes still have to be made manually.
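
    The set and read macros might look something like the sketch below.
The names SLONG_SET and SLONG_GET are hypothetical, and the sketch assumes
that hi holds the most significant 16 bits, matching the 68k's big-endian
order; a device with a different layout would need the halves swapped.

```c
typedef struct slong		/* split longs up into 16 bit halves */
{
	unsigned short hi;
	unsigned short lo;
} SLONG;

/* Store the low 32 bits of v into s as two 16 bit halves.
 * Note: both s and v are evaluated twice, so avoid side effects. */
#define SLONG_SET(s, v) \
	((s).hi = (unsigned short)(((unsigned long)(v) >> 16) & 0xFFFFu), \
	 (s).lo = (unsigned short)((unsigned long)(v) & 0xFFFFu))

/* Reassemble the 32 bit value from the two halves of s. */
#define SLONG_GET(s) \
	((((unsigned long)(s).hi & 0xFFFFu) << 16) | \
	 ((unsigned long)(s).lo & 0xFFFFu))
```

A driver would then write SLONG_SET(regs->addr, paddr) where it used to
assign the long directly, with regs and paddr standing in for whatever the
driver actually uses.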

    Without knowing what your drivers look like, I certainly 
couldn't even guess how long it would take you but it took me
about a month to make and debug the changes for a half dozen or
so drivers.  Since I was converting across architectures, I don't
have much idea about what performance increase was due to the
alignment changes - I would hazard a guess to expect in the
range of a few percent.

    Other than the above, you might check to make sure the stack
is always kept aligned to a 4 byte address (e.g. look at the low level
exception handling and how it switches/manipulates the SP).  
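
    The check itself is simple arithmetic on the stack pointer value.  A
minimal sketch, assuming a downward-growing stack as on the 68k family (so
rounding down never overwrites live data); the names STACK_ALIGN,
sp_align_down and sp_is_aligned are hypothetical:

```c
#include <stdint.h>

#define STACK_ALIGN 4u	/* required stack alignment in bytes, power of 2 */

/* Round a stack pointer value down to the next aligned address. */
static uintptr_t sp_align_down(uintptr_t sp)
{
	return sp & ~(uintptr_t)(STACK_ALIGN - 1);
}

/* Nonzero if the stack pointer value is already aligned. */
static int sp_is_aligned(uintptr_t sp)
{
	return (sp & (STACK_ALIGN - 1)) == 0;
}
```

In a real kernel the adjustment would be made in the assembly-level trap
and signal delivery paths; the C above only shows the calculation.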



From: edler@cmcl2.NYU.EDU (Jan Edler)
Date: 27 Feb 89 19:07:39 GMT
Organization: New York University, Ultracomputer project

We did this, but for a 68010 system, not a 68020 one.  Our motivation
was peculiar to our hardware (experimental shared memory
multiprocessor) - we built hardware to make loads and stores of
properly aligned longs atomic.

Anyway, other than recompiling everything, I don't recall the change
being terribly difficult.  We had to be careful to make all the
necessary compiler changes (e.g.  make sure automatics are properly
aligned), but that wasn't too bad.  I don't remember too many instances
of user code that had to change; there may have been a couple things in
the library, but they weren't hard to find.  For example, the user-level
signal handling code and the kernel trap handling code have to be
careful to maintain stack alignment properly.

The most difficult aspect of it is probably the complete
incompatibility of the change.  In general, you can't trust binaries
from before the changeover.  Also, we use cross compilers a lot, and it
is annoying that we need a special cross compiler; we can't use just
any old 68k cross compiler (e.g. one that also works for 68k-based
workstations or single-board computers).

We never looked into the question of how much faster the machine runs
with proper alignment, but I suppose it could be noticeable.

Jan Edler
-- 
Alan Char				"Feel free to think." --Dave Clough
ucbvax!sun!atari!achar