[comp.unix.microport] --- GCC on an iAPX286

fortin@zap.UUCP (Denis Fortin) (04/11/88)

(I cross-posted to comp.lang.c since there are probably a number of people
 familiar with GCC there, but I am redirecting any follow-ups to
 comp.unix.microport since I guess a whole discussion on this might not
 be interesting to the comp.lang.c community at large :-)

Recently there has been some talk in comp.unix.microport about getting
GCC (the Gnu C Compiler) running under Microport's System V/AT on an
iAPX286.  Most of the comments were fairly negative (!), saying in
essence that it couldn't really be done because of the '286
architecture.

Well, I wonder about that.  But first, let me explain why I'm interested
in GCC: I like Microport's System V/AT.  I've had it close to a year now
and apart from a couple of (somewhat annoying!) bugs, it has served me
pretty well.  I've been able to get a lot of software off the net running
on it (news, MicroEmacs, compress, sc, smail, etc) and since it *is* a
System V, I generally don't have too many problems running stuff that is
not Berkeley-specific.

However, there is a class of software that I've consistently had trouble
with: the programs that like to make the (quite valid) assumption that
machines these days use 32-bit pointers and allow the use of large
arrays.  Some sample programs in this class are pathalias, compress, Gnu
Emacs, etc. 

Now, I realize that the basic architecture of the '286 is segmented and
that the '286 prefers to deal with 16-bit ints rather than the 32-bit
kind, but I can't see the big problem in emulating 32-bit ints (we
already have "longs", don't we?) and a flat address space (that requires
a bit of arithmetic on the pointers, but it's still possible).
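
To be concrete about the 32-bit-int half: it's nothing more exotic than
the multi-word arithmetic a 16-bit compiler already emits for "long".
Here's a little sketch of the idea (purely illustrative on my part; a
real code generator would of course emit machine code, not C):

    /* Illustrative only: 32-bit addition built from 16-bit halves,
     * the same multi-word arithmetic a 16-bit compiler already does
     * for "long".  A 32-bit "int" is just this trick used everywhere.
     */
    struct u32 {
        unsigned short lo;   /* low 16 bits  */
        unsigned short hi;   /* high 16 bits */
    };

    static struct u32 add32(struct u32 a, struct u32 b)
    {
        struct u32 r;

        r.lo = (unsigned short)(a.lo + b.lo);
        /* a carry came out of the low half iff the 16-bit sum wrapped */
        r.hi = (unsigned short)(a.hi + b.hi + (r.lo < a.lo));
        return r;
    }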

All that is required in order for software to believe that the '286 is
a 32-bit machine is a C compiler that makes the hardware appear that
way.  Of course there would be a performance penalty to pay, but when
the choices are to either run a large program somewhat slowly or not
run it at all, I don't really think it matters all that much!  Besides,
those trusty little old Z-80s and 6502s in the 8-bit micros have been
emulating 16-bit integers on 8-bit machines for years (:-).

Now, I'm not 100% familiar with the iAPX286 architecture and with GCC,
but I think it might be possible.  Does anybody care to comment?

(PS. If anybody knows of a high-quality C compiler that would work under 
     Microport System V/AT and give me both the HUGE model and 32-bit ints, 
     I might be willing to buy it, but I don't think that there exists one
     that would do exactly what I need!)
-- 
Denis Fortin                            | fortin@zap.UUCP
CAE Electronics Ltd                     | philabs!micomvax!zap!fortin
The opinions expressed above are my own | fortin%zap.uucp@uunet.uu.net

fox@alice.marlow.reuters.co.uk (Paul Fox) (04/20/88)

In article <435@zap.UUCP> fortin@zap.UUCP (Denis Fortin) writes:
>
>All that is required in order for software to believe that the '286 is
>a 32-bit machine is a C compiler that makes the hardware appear that
>way.  Of course there would be a performance penalty to pay, but when
>the choices are to either run a large program somewhat slowly or not
>run it at all, I don't really think it matters all that much!  Besides,
>those trusty little old Z-80s and 6502s in the 8-bit micros have been
>emulating 16-bit integers on 8-bit machines for years (:-).
>
Your basic premise is correct.  The large-model 286 compilers attempt to
give the C programmer a virtual 32-bit environment.  The problem that
crops up is this: say we have the address 0x4700:0xFFFF (selector =
0x4700, offset = 0xFFFF).  There is no 'next' address.  On a 32-bit
machine the next address would be 0x4701:0x0000.  On the 286,
incrementing a 'far' pointer normally produces 0x4700:0x0000, since the
upper 16 bits are kept separate from the lower 16 bits and the offset
simply wraps.  The compiler could treat all 32-bit addresses as if they
were 'longs' and thus generate the address 0x4701:0x0000, but then the
problem is that 0x4701 is not a valid selector for the next segment.
The 286 uses the top 13 bits of a selector as a descriptor-table index
(8192 entries per table), one bit to choose between the global and local
descriptor tables, and the bottom two bits as a privilege level.  Adding
1 to 0x4700 only changes the privilege bits, so 0x4701 does not name the
next 64K at all.  The kernel would need to do something about this.
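
To make that selector layout concrete, here is a small decoding sketch
(illustrative only; the bit layout is what matters, not the C).  Note
that 0x4700 + 1 only changes the privilege bits, while the next
descriptor slot is 8 away:

    #include <stdio.h>

    /* 286 protected-mode selector layout:
     *   bits 15..3  descriptor-table index (8192 entries per table)
     *   bit  2      table indicator (0 = GDT, 1 = LDT)
     *   bits 1..0   requested privilege level (RPL)
     */
    static void decode(unsigned sel)
    {
        printf("selector 0x%04X: index %u, %s, RPL %u\n",
               sel, sel >> 3, (sel & 4) ? "LDT" : "GDT", sel & 3);
    }

    int main(void)
    {
        decode(0x4700);  /* some data segment                       */
        decode(0x4701);  /* same descriptor index, different RPL -- */
                         /* NOT the next 64K of memory              */
        decode(0x4708);  /* this one names the next descriptor slot */
        return 0;
    }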

What one normally finds in large-model programs is that selectors are
allocated in a sequence like 0x67, 0x6f, 0x77, 0x7f, 0x87, and so on:
each 8 apart, i.e. consecutive descriptor slots.  Thus making the
compiler 'correct' would not get over the problem of the underlying
o/s, which has to have set those descriptors up as adjacent 64K pieces
of one object in the first place.

Some of the compilers support a 'huge' model abstraction which attempts
to hide the 'funnies' of the architecture.  However, not only do they
have severe limitations (data structures larger than 128K are restricted
to power-of-two sizes, etc), but if you look at the code generated, you
will see something like 30 instructions for a single *ptr++ type
operation.  Performance is really, really bad.
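
For a feel of where those 30 instructions go, here is a rough C-level
rendition of the renormalization a huge-pointer increment implies
(illustrative only; the real thing is generated assembly and the exact
sequence varies by compiler), assuming the stride-of-8 selector
allocation above:

    /* Rough picture of a "huge" pointer increment. */
    #define SEL_STRIDE 8u   /* distance between consecutive selectors */

    struct hugeptr {
        unsigned sel;       /* segment selector        */
        unsigned off;       /* 16-bit offset within it */
    };

    /* Advance by n bytes, carrying overflow out of the 16-bit offset
     * into the selector.  Something like this, plus a segment-register
     * reload at the access, hides behind every *ptr++ on a huge
     * pointer, and each line here is several 286 instructions.       */
    static struct hugeptr huge_add(struct hugeptr p, unsigned long n)
    {
        unsigned long linear = (p.off & 0xFFFFul) + n;

        p.sel += (unsigned)(linear >> 16) * SEL_STRIDE;
        p.off  = (unsigned)(linear & 0xFFFFul);
        return p;
    }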


=====================
     //        o      All opinions are my own.
   (O)        ( )     The powers that be ...
  /    \_____( )
 o  \         |
    /\____\__/      
  _/_/   _/_/         UUCP:     fox@alice.marlow.reuters.co.uk