[comp.sys.att] Could a 68881 Math co-processor go fast?; HwNote07

jbm@uncle.UUCP (John B. Milton) (10/24/88)

Parts of this article are copied from mail to David Wexelblat
(dwex@mtgzz.att.com), who suggested a 68881 when I asked for hw ideas. The
rest of you just don't seem to have any ideas :-!

I was thinking about that one a while ago. AT&T actually made a 68881 board,
they even have a part number, 105160253. I talked to some guy at AT&T, and
he says the board was never released due to poor performance. Since the 68010
does not have a coprocessor interface, the 68881 would have to operate as an
I/O device, so it would work fine as an expansion card. David tells me that
the software interface was implemented using an F-line exception handler. While
this is clean from the compiler point of view (you just generate 68881
instructions), it is expensive:
1. Execute instruction
2. Get F-line exception
3. Stack everything away
4. Start running the common kernel exception code
5. Pull all the parameters from the user PC and talk to the 68881 through I/O
6. Defer to another process while the instruction runs on the 68881
7. Get an interrupt, get the status
8. Call the kernel code to return from exception.

Steps 6&7 could be replaced with
6. Loop polling the 68881 for completion
7. Get the status
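
A minimal sketch of the polled variant (steps 5 through 7), in C. The device
side is an assumption for illustration only: the fpa_regs layout, the FPA_BUSY
bit, and the function names are hypothetical and do not describe AT&T's board
or the real 68881 interface registers. The only parts taken from the 68000
family definition are that F-line opcodes have 1111 in the top four bits and
carry a coprocessor ID in bits 11-9.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register window for a 68881 wired up as plain I/O. */
struct fpa_regs {
    volatile uint16_t command;  /* write: command word for the 68881 */
    volatile uint16_t status;   /* read: FPA_BUSY set while it works  */
};

#define FPA_BUSY 0x8000u        /* hypothetical "still running" bit */

/* True if the opcode word traps through the F-line vector. */
int is_fline(uint16_t opcode)
{
    return (opcode & 0xF000u) == 0xF000u;
}

/* Coprocessor ID field, bits 11-9 of an F-line opcode word. */
unsigned fline_cpid(uint16_t opcode)
{
    return (opcode >> 9) & 0x7u;
}

/*
 * Steps 5-7, polled: hand the command word to the device, spin until
 * the busy bit clears or the spin budget runs out, then return the
 * status. Returns -1 on timeout so a wedged board cannot hang the
 * exception handler forever.
 */
long fpa_run_polled(struct fpa_regs *fpa, uint16_t command,
                    unsigned long max_spins)
{
    assert(fpa != NULL);

    fpa->command = command;
    while (max_spins-- > 0) {
        uint16_t st = fpa->status;
        if ((st & FPA_BUSY) == 0)
            return (long)st;
    }
    return -1;
}
```

Polling avoids the second trip through the interrupt path, at the price of
holding the CPU for the whole duration of the floating point operation.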

I don't know what they did about having an expanded process context for procs
using the 68881.

It seems the only way to get decent speed is to have the program using the
68881 be the only one using it and to access it directly. David's idea is to
modify GNU C to generate the proper I/O directly. But the only code that could
directly access I/O would be kernel code, like a loadable driver. It wouldn't
be so fun to write all your floating point programs as loadable drivers. Hmm,
you would have unlimited CPU... Nah

On the other hand, there is a messy solution. The user only has access to the
user memory mapped to his process. There is a routine plock(2) which might be
used to lock a page so that the physical address of the page would never
change and would always be there. I have never tried to use this on the UNIXpc,
I just checked to make sure it's in the manual. The user address could be passed
to a /dev/fpa driver via ioctl(2). The driver could take this address and
find the physical address corresponding to it. This address could be passed
from the driver to some I/O which would map the 68881 in every time there is
an access to this page. When the file to this driver is closed, the mapping
is disabled. The fpa device could only be opened once. The program would have
to run suid to call plock(), but it could do a setuid(getuid()) once it's
running, which would be a reasonable compromise.
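
The user-side ordering of that scheme can be sketched in C as below. This is a
dry run under stated assumptions, not working UNIXpc code: /dev/fpa does not
exist, the FPA_MAP request name is made up, and the OS calls sit behind
function pointers (the fpa_os struct, also made up) so the sequence can be
exercised with stand-ins. One caveat worth checking against the manual: as
documented for System V, plock(2) locks a whole text or data segment rather
than a single page.

```c
#include <assert.h>
#include <stddef.h>

/*
 * The calls the scheme needs. On the UNIXpc these would presumably wrap:
 *   lock_data -> plock(DATLOCK)            (needs super-user)
 *   drop_priv -> setuid(getuid())
 *   open_dev  -> open("/dev/fpa", O_RDWR)  (hypothetical device)
 *   map_page  -> ioctl(fd, FPA_MAP, page)  (hypothetical request)
 *   close_dev -> close(fd)
 * Each returns -1 on failure, like the system calls it stands in for.
 */
struct fpa_os {
    int (*lock_data)(void);
    int (*drop_priv)(void);
    int (*open_dev)(void);
    int (*map_page)(int fd, void *page);
    int (*close_dev)(int fd);
};

/*
 * Lock memory while still privileged, give up privilege right away,
 * then open the device and ask the driver to map the 68881 over `page`.
 * Returns the open descriptor, or -1 if any step fails.
 */
int fpa_attach(const struct fpa_os *os, void *page)
{
    int fd, locked;

    assert(os != NULL && page != NULL);

    locked = os->lock_data();        /* the only step that needs root */
    if (os->drop_priv() == -1)       /* drop root whether or not it worked */
        return -1;
    if (locked == -1)
        return -1;

    fd = os->open_dev();
    if (fd == -1)
        return -1;
    if (os->map_page(fd, page) == -1) {
        os->close_dev(fd);           /* closing the device ends the mapping */
        return -1;
    }
    return fd;
}

/* Dry-run stand-ins that only record the order of the calls. */
static char trace[8];
static size_t ntrace;

static int note(char c)
{
    if (ntrace < sizeof trace - 1)
        trace[ntrace++] = c;
    return 0;
}

static int dry_lock(void)  { return note('L'); }
static int dry_drop(void)  { return note('D'); }
static int dry_open(void)  { note('O'); return 3; }
static int dry_map(int fd, void *page) { (void)fd; (void)page; return note('M'); }
static int dry_close(int fd) { (void)fd; return note('C'); }

const struct fpa_os fpa_dry_run = {
    dry_lock, dry_drop, dry_open, dry_map, dry_close
};

/* Letters for the calls made so far, in order. */
const char *fpa_trace(void) { return trace; }
```

Calling fpa_attach(&fpa_dry_run, buf) and then reading fpa_trace() shows the
order lock, drop, open, map. Dropping privilege immediately after the lock
keeps the suid window as short as the scheme allows.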

The hardware could not be on the expansion bus for this idea, because you
could not be sure that a given user page is on an expansion memory board.
There could also be nasty problems if the page is used for anything else
while mapped to the 68881. Some good news is that the 68881 could be run
asynchronously, so it could have its own clock. So yes, it could be a 33 MHz
68882 (if they're that fast yet?). Yes, it would cost more than the machine.
I don't know what the current price is for the 12.5 MHz 68881. Assuming I
could whip together something that would work, how many people would be
interested in buying a board without the 6888[12]? Please e-mail.

Ok, tell me how much of this I got wrong. Also send me any other ideas. Since
this would all be done in I/O, I suppose it could be ANY overpriced FPA. I
know what you're thinking now, how about a DSP instead? Sure, I think all this
would apply.

John
-- 
John Bly Milton IV, jbm@uncle.UUCP, n8emr!uncle!jbm@osu-cis.cis.ohio-state.edu
home (614) 294-4823, work (614) 764-4272;  Send vi tricks, I'm making a manual

ditto@cbmvax.UUCP (Michael "Ford" Ditto) (10/25/88)

I'd just put in a 68020 at the same time; then you get the coprocessor
interface and a few other benefits.

Computer System Associates in San Diego makes "daughter board" 68020/881
upgrades that plug into a 68010 socket.  They come in several different
shapes, so maybe one would fit in the Unix PC.  There might be some minor
OS problems.  (And, of course, minor OS problems become major ones without
source code, but I think it would be doable.)
-- 
					-=] Ford [=-

"The number of Unix installations	(In Real Life:  Mike Ditto)
has grown to 10, with more expected."	ford@kenobi.cts.com
- The Unix Programmer's Manual,		...!sdcsvax!crash!elgar!ford
  2nd Edition, June, 1972.		ditto@cbmvax.commodore.com

tkacik@rphroy.UUCP (Tom Tkacik) (10/25/88)

In article <5087@cbmvax.UUCP> ditto@cbmvax.UUCP (Michael "Ford" Ditto) writes:
>I'd just put in a 68020 at the same time; then you get the coprocessor
>interface and a few other benefits.

Sounds like a good idea, but how do we do it?
>
>Computer System Associates in San Diego makes "daughter board" 68020/881
>upgrades that plug into a 68010 socket.  They come in several different
>shapes, so maybe one would fit in the Unix PC.  There might be some minor
>OS problems.  (And, of course, minor OS problems become major ones without
>source code, but I think it would be doable.)
>-- 
>					-=] Ford [=-
>"The number of Unix installations	(In Real Life:  Mike Ditto)

I think the hardware problem is trivial; as you mentioned, there are
companies who make the needed daughter board.

Unfortunately, without source code, I do not think the problems with the OS
are minor; they are major.  The size of the saved state of a 68020 and 68881
is much larger than that for the 68010.  Each time there is a context switch
the OS thinks it is putting away info for the 68010.  We might be able to
find out where in the OS this is done, and make a slight change to tell it
to store away the extra information, but where would it go?  I do not think
we would be able to increase the structure size used to save this data.
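
For scale, here is a sketch in C of just the programmer-visible 68881 state a
context switch would have to preserve: eight data registers, each stored as
96 bits in memory, plus the FPCR, FPSR and FPIAR control registers. The struct
and function names are made up for illustration. The internal state frame
written by FSAVE and any extra 68020 registers come on top of this and are not
counted here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One 68881 data register in its 96-bit memory format. */
struct fp_reg {
    uint32_t word[3];
};

/* Programmer-visible 68881 state, per process. */
struct fpu_context {
    struct fp_reg fp[8];   /* FP0-FP7 */
    uint32_t fpcr;         /* control register */
    uint32_t fpsr;         /* status register */
    uint32_t fpiar;        /* instruction address register */
};

/* Extra bytes each process using the 68881 would need saved somewhere. */
size_t fpu_context_bytes(void)
{
    /* 8 * 12 + 3 * 4; trips if the compiler inserts padding */
    assert(sizeof(struct fpu_context) == 108);
    return sizeof(struct fpu_context);
}
```

That is 108 bytes per process before any FSAVE frame, which has to live
somewhere the existing kernel structures were never sized for.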

Then again, as Lenny T. pointed out, the compiler has the switches for
generating code for the 68020-68881 combination.  Maybe this was to be
a possible upgrade, and was considered when the OS was designed?  Perhaps
the hooks are all there, only waiting to be found.  Any ideas about where
to start looking?

---
Tom Tkacik
GM Research Labs
Warren MI
{umix, uunet!edsews}!rphroy!megatron!tkacik
{umix, uunet!edsews}!rphroy!tetnix!tet

thad@cup.portal.com (Thad P Floryan) (10/26/88)

Mike Ditto suggested trying one of the CSA 68020/68881 boards in the UNIXpc.

I just happen to have one here (when I replaced the 020/881 in one of my
Amigas with an 030), so I'll plug it into one of the "extra" UNIXpcs here
and see what happens!

Only see two problems:

1) the CSA daughterboard is tall, so will have to run with the UNIXpc's
   hood open, and

2a) HOW can I issue the MOVEC to enable the 020's cache? (since it's a
    privileged instruction)
2b) HOW to handle the stack frame differences upon interrupts?
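
On 2b, a sketch in C of where the frames differ. Both the 68010 and the 68020
push a format/vector word with the frame, with the format number in the top
four bits and the vector offset in the low twelve. What changes is which
formats can show up and how long they are. The function names are made up for
illustration, and the sizes are the ones given in the Motorola manuals as best
recalled, so check them against the book before relying on them.

```c
#include <stdint.h>

/* Format number: top four bits of the format/vector word. */
unsigned frame_format(uint16_t fmt_vec)
{
    return fmt_vec >> 12;
}

/* Vector offset: low twelve bits of the same word. */
unsigned frame_vector_offset(uint16_t fmt_vec)
{
    return fmt_vec & 0x0FFFu;
}

/*
 * Total frame length in 16-bit words for the given CPU, or 0 if that
 * CPU is not expected to generate the format.
 */
unsigned frame_words(int is_68020, uint16_t fmt_vec)
{
    switch (frame_format(fmt_vec)) {
    case 0x0: return 4;                   /* short frame, both CPUs       */
    case 0x1: return is_68020 ? 4 : 0;    /* throwaway frame              */
    case 0x2: return is_68020 ? 6 : 0;    /* adds an instruction address  */
    case 0x8: return is_68020 ? 0 : 29;   /* 68010 bus or address error   */
    case 0x9: return is_68020 ? 10 : 0;   /* coprocessor mid-instruction  */
    case 0xA: return is_68020 ? 16 : 0;   /* short bus cycle fault        */
    case 0xB: return is_68020 ? 46 : 0;   /* long bus cycle fault         */
    default:  return 0;
    }
}
```

Kernel code that assumes every frame is either format 0 or the 68010's format
8 would mis-handle the 68020-only formats, which is where the trouble would
likely start.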

Thad Floryan [thad@cup.portal.com (OR) ...!sun!portal!cup.portal.com!thad]