[comp.arch] Unusual instructions and constructions

hrubin@pop.stat.purdue.edu (Herman Rubin) (03/09/91)

There clearly is a major disagreement on what "should" be in an architecture
or language.  One of the arguments given against including assembler code is
that compiler optimization is at least made more difficult, and portability is
lost.

What people like Montgomery, Silverman, and I, as well as many others, have
pointed out is that if the operation is not in the language, the compiler
cannot do a good job of getting the architecture to do it.  Furthermore, 
while one can always simulate what is wanted by a clumsy sequence of operations
in the language, this is at least extremely likely to generate efficient code.

Optimizing compilers "know" about certain means of translating the language
into organized combinations of hardware operations, and have some abilities
to pick efficient ones.  But they cannot include things about which they do
not have any cognizance.

The "solution" I suggest is to allow the programmer, etc., to set up idioms 
in whatever syntax is easiest to use for that programmer, and to provide the
translations into some adequate intermediate language.  There may be, and in
fact should be, alternate translations.  Even the addition of two vectors to
produce a result should have different C code on different machines.

Presumably, those types of expressions which are found useful will eventually
get into the languages.  But the expressions themselves will have considerable
portability, although if the expression is unknown in the target dialect, a
dictionary will have to be provided.

There are two reasons for posting this to comp.arch as well as comp.lang.misc.
For one, much of the discussion has been there, and it seems that many of the
posters there consider language architecture to be architecture.  For another,
by having this type of "portable" representation of what people want computed,
hardware designers may learn something about costs and tradeoffs which they are
not getting now.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

gsteckel@vergil.East.Sun.COM (Geoff Steckel - Sun BOS Hardware CONTRACTOR) (03/09/91)

In article <7499@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>There clearly is a major disagreement on what "should" be in an architecture
>or language.  One of the arguments given against including assembler code is
>that compiler optimization is at least made more difficult, and portability is
>lost.

One thing lost in the recent debate is the possibility of applying `object'
methodology to the complex operation debate.  The C language already has
the ability to return a structure from a function; the `divrem' function
could return   struct div_and_rem { int quotient ; int remainder } ;

C++ (admitting its many faults) can implement this sort of operation easily.
The usage (not the formal definition) of the language (and object-like languages
in general) includes the idea of `standard object heirarchies'.  These are
currently distributed for C++ from a number of sources.  This seems to be
a way for the {numerical analysts, crypto specialists, molbiogeneticists, etc.}
to introduce their special operators, distribute them, standardize them, etc.

For efficiency of human time, the users of the extended functionality would
have to organize a bit to coordinate development and distribution of prototypes
and specifications.  Once common sets of these functions stabilized a bit,
it would then be more likely that vendors would be willing to invest in special
efforts to improve versions for particular hardware.

Just a thought...
	geoff steckel (gwes@wjh12.harvard.EDU)
			(...!husc6!wjh12!omnivore!gws)
Disclaimer: I am not affiliated with Sun Microsystems, despite the From: line.
This posting is entirely the author's responsibility.

hrubin@pop.stat.purdue.edu (Herman Rubin) (03/10/91)

In article <4748@eastapps.East.Sun.COM>, gsteckel@vergil.East.Sun.COM (Geoff Steckel - Sun BOS Hardware CONTRACTOR) writes:
> In article <7499@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> >There clearly is a major disagreement on what "should" be in an architecture
> >or language.  One of the arguments given against including assembler code is
> >that compiler optimization is at least made more difficult, and portability is
> >lost.
> 
> One thing lost in the recent debate is the possibility of applying `object'
> methodology to the complex operation debate.  The C language already has
> the ability to return a structure from a function; the `divrem' function
> could return   struct div_and_rem { int quotient ; int remainder } ;
> 
> C++ (admitting its many faults) can implement this sort of operation easily.

			...........................

These suggestions are not enough, for two reasons.  Structs are not adequate
for two reasons.  One is that the results need not be stored, but frequently,
possibly even predominately, they should be kept in registers for proximate
use.  The other is that struct is too clumsy, as there is no automatic reason
(although in some cases hardware may do it for simplicity) to assign adjacent
locations for the results.  Despite some formal similarities, a list is not a
struct.

Some examples of this are dividing float by float with an integer quotient and
a floating remainder, which is a widely used operation, and the frexp function
in C, which writes a float as as one whose magnitude is in [.5,1) and puts the
power of 2 into a memory location

	y = frexp(x,&n)

instead of

	y,n = frexp(x)

The other is notation.  I see not good reason why I should not be able to write

	q,r = a/b

for both the integer and the floating case, with q being the integer quotient in
both cases.  I would like to write

	exp,mant =UP x,

where exp is the exponent and mant the mantissa for the floating point x, etc.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

chip@tct.uucp (Chip Salzenberg) (03/12/91)

According to hrubin@pop.stat.purdue.edu (Herman Rubin):
>The "solution" I suggest is to allow the programmer, etc., to set up idioms 
>in whatever syntax is easiest to use for that programmer, and to provide the
>translations into some adequate intermediate language.

A spec, Herman.  Surely you can afford some few hours out of your busy
academic schedule to write a spec.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/12/91)

In article <7571@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> The other is that struct is too clumsy, as there is no automatic reason
> (although in some cases hardware may do it for simplicity) to assign adjacent
> locations for the results.  Despite some formal similarities, a list is not a
> struct.

Data point: Q has both ``struct'' and ``bstruct''. The former makes no
guarantees about memory arrangement; the latter makes similar guarantees
to ANSI C's struct. As the language's sole user for now, I can assure
you that it's normal practice to have a function return a struct, and to
write (q,r) = div(a,b). (Yes, you can make a struct of lvalues.)

In the C code output by q2c, the place to store a struct returned from
a function is currently passed as an initial argument, in this case
pointers to q and r. Somewhere in my optimization list is having a
function use registers for this when possible.

---Dan

alex@vmars.tuwien.ac.at (Alexander Vrchoticky) (03/13/91)

hrubin@pop.stat.purdue.edu (Herman Rubin) writes:

> Even the addition of two vectors to
> produce a result should have different C code on different machines.

For the purposes of design diversity? Are you sure you did not want to
type `machine' instead of `C' there?

> But the expressions themselves will have considerable
> portability, although if the expression is unknown in the target dialect, a
> dictionary will have to be provided.

Was hat man im Zusammenhang mit Compilertechnologie unter einem `Dictionary'
zu verstehen? Unter `Portabilitaet von Ausdruecken'?

[sorry, i could not resist :-)]

--
Alexander Vrchoticky            | alex@vmars.tuwien.ac.at
TU Vienna, CS/Real-Time Systems | +43/222/58801-8168
"those who feel they're touched by madness, sit down next to me" (james)

jbuck@galileo.berkeley.edu (Joe Buck) (03/14/91)

In article <7571@mentor.cc.purdue.edu>, hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
|> In article <4748@eastapps.East.Sun.COM>, gsteckel@vergil.East.Sun.COM (Geoff Steckel - Sun BOS Hardware CONTRACTOR) writes:
|> > One thing lost in the recent debate is the possibility of applying `object'
|> > methodology to the complex operation debate.  The C language already has
|> > the ability to return a structure from a function; the `divrem' function
|> > could return   struct div_and_rem { int quotient ; int remainder } ;

|> These suggestions are not enough, for two reasons.  Structs are not adequate
|> for two reasons.  One is that the results need not be stored, but frequently,
|> possibly even predominately, they should be kept in registers for proximate
|> use.  The other is that struct is too clumsy, as there is no automatic reason
|> (although in some cases hardware may do it for simplicity) to assign adjacent
|> locations for the results.  Despite some formal similarities, a list is not a
|> struct.

Who said that a struct needs adjacent memory locations?  gcc returns a two-int
(or two-float, or two-double) struct in a pair of registers.  Functions can
be declared inline, so you might even be able to return three or more results
this way.  It really does behave like a list.

Check out this short program:

struct div_and_rem {
	int quotient;
	int remainder;
 };

inline struct div_and_rem divide(int dividend, int divisor) {
	struct div_and_rem t;
	t.quotient = dividend/divisor;
	t.remainder = dividend % divisor;
	return t;
}

foop (int x, int y) {
	struct div_and_rem t;
	t = divide(x,y);
	printf ("%d %d\n", t.quotient, t.remainder);
}

For a 68020, gcc generates

LC0:
	.ascii "%d %d\12\0"
	.even
.globl _foop
_foop:
	link a6,#0
	moveml #0x3800,sp@-
	movel a6@(8),d0
	movel a6@(12),d1
	movel d0,d4
	divsl d1,d4
	movel d4,d2
	divsll d1,d1:d0
	movel d1,d3
	movel d3,sp@-
	movel d2,sp@-
	pea LC0
	jbsr _printf
	moveml a6@(-12),#0x1c
	unlk a6
	rts

It's not optimal: there are two divides.  Still, if you write kindly
I'll bet you could talk RMS into seeing if he could recognize the
pattern and produce one divide in gcc 2.0 (or for some higher number).
But notice the important thing:
the struct has vanished!  The routine truly returns two results,
in registers, without requiring adjacent memory locations.



--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

sef@kithrup.COM (Sean Eric Fagan) (03/14/91)

In article <11964@pasteur.Berkeley.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
>Check out this short program:

[code deleted]
>It's not optimal: there are two divides.  Still, if you write kindly
>I'll bet you could talk RMS into seeing if he could recognize the
>pattern and produce one divide in gcc 2.0 (or for some higher number).


foop:
	pushl %ebp
	movl %esp,%ebp
	pushl %ebx
	movl 8(%ebp),%eax
	cltd
	idivl 12(%ebp)
	movl %eax,%ecx
	movl %edx,%ebx
	pushl %ebx
	pushl %ecx
	pushl $.LC0
	call printf
	leal -4(%ebp),%esp
	popl %ebx
	leave
	ret

It was already done, for gcc 1.3[89].  Good work, eh?  Yes, the code could
be optimal:  gcc could look at the entire function, and not bother moving
from eax to ecx, just pushing them directly.  But those are small amounts
(two or three cycles, I believe), and are *much* better than the 15+ cycles
the extra divide would have taken.

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

hrubin@pop.stat.purdue.edu (Herman Rubin) (03/15/91)

In article <1991Mar14.013109.16636@kithrup.COM>, sef@kithrup.COM (Sean Eric Fagan) writes:
> In article <11964@pasteur.Berkeley.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
> >Check out this short program:
> 
> [code deleted]
> >It's not optimal: there are two divides.  Still, if you write kindly
> >I'll bet you could talk RMS into seeing if he could recognize the
> >pattern and produce one divide in gcc 2.0 (or for some higher number).
> 
> 
> foop:
| 	pushl %ebp
| 	movl %esp,%ebp
| 	pushl %ebx
| 	movl 8(%ebp),%eax
| 	cltd
| 	idivl 12(%ebp)
| 	movl %eax,%ecx
| 	movl %edx,%ebx
| 	pushl %ebx
| 	pushl %ecx
| 	pushl $.LC0
| 	call printf
| 	leal -4(%ebp),%esp
| 	popl %ebx
| 	leave
| 	ret
> 
> It was already done, for gcc 1.3[89].  Good work, eh?  Yes, the code could
> be optimal:  gcc could look at the entire function, and not bother moving
> from eax to ecx, just pushing them directly.  But those are small amounts
> (two or three cycles, I believe), and are *much* better than the 15+ cycles
> the extra divide would have taken.

This example has far too many loads and stores.  Possibly this MIGHT not be
too important for a division, but how about something like frexp?  The 
operations may be register-register, in which case all these loads and
stores are inappropriate.  Also, something this simple should be inlined;
if a subroutine call, there is the additional save/restore overhead which
has to be done somewhere.

The real need is for the languages and compilers to allow the user to 
introduce idioms, with translation into machine primitives.  In the above
example, idivl is such a primitive, and should be considered no differently
than the various types of subtraction.  The relevant idiom in the above 
example would be 

	q,r = x/y,

where the / is overloaded some more.

If instead r, x, and y were floating point, and q integer, the code would be
quite different, 

Mathematical notation has been developing over centuries, and we still see
many new idioms and overloadings.  It is not necessary to have a committee
to decide what notation will be allowed and what will not.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

sef@kithrup.COM (Sean Eric Fagan) (03/15/91)

In article <7850@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
>| 	pushl %ebp
>| 	pushl %ebx
>| 	movl 8(%ebp),%eax
>| 	idivl 12(%ebp)
>| 	pushl %ebx
>| 	pushl %ecx
>| 	pushl $.LC0
>| 	leal -4(%ebp),%esp
>| 	popl %ebx
>> 
>This example has far too many loads and stores.  

9 memory references.  4 necessary for the calling sequence gcc conforms to.
Three necessary to call another function, since the 'bcs' does not specify
calling routines with values in registers.   Two more because arguments
passed in are not in registers.  Leaving a total of 0 unnecessary loads and
stores.  Could this be improved?  Certainly.  But not by much.

>Possibly this MIGHT not be
>too important for a division, but how about something like frexp?  

I think it was frexp() that I wrote for berkeley using gcc with inline
assembly.  Uhm... I think it had 7 loads and stores, all but two or three of
which would disappear if the function got inlined and optimized.

>The 
>operations may be register-register, in which case all these loads and
>stores are inappropriate.  

Herman:  where are you supposed to get the values from?  Magic?

>Also, something this simple should be inlined;
>if a subroutine call, there is the additional save/restore overhead which
>has to be done somewhere.

Jesus.  Guess what, herman:  the routine *was* inlined.  Take a look at the
original source code again.

I think you're just complaining because you seem to thing there's something
wrong, when there really isn't.

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (03/15/91)

In article <2378@tuvie.UUCP> alex@vmars.tuwien.ac.at (Alexander Vrchoticky) writes:
> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> > Even the addition of two vectors to
> > produce a result should have different C code on different machines.
> For the purposes of design diversity? Are you sure you did not want to
> type `machine' instead of `C' there?

#ifdef VECTOR
   for (i = 0; i < k; i++) y[i] = neg(x[i + r - k]);
   for (i = k; i < r; i++) y[i] = x[i - k];
   for (i = 0; i < r; i++) x[i] = y[i];
#else
   for (i = 0; i < k; i++) y[i] = x[i + r - k];
   for (i = r - 1; i >= k; i--) x[i] = x[i - k];
   for (i = 0; i < k; i++) x[i] = neg(y[i]);
#endif

Well, okay, this is negacyclic rotation, but the point is the same.

---Dan

hrubin@pop.stat.purdue.edu (Herman Rubin) (03/15/91)

In article <2378@tuvie.UUCP>, alex@vmars.tuwien.ac.at (Alexander Vrchoticky) writes:
> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:
> 
> > Even the addition of two vectors to
> > produce a result should have different C code on different machines.
> 
> For the purposes of design diversity? Are you sure you did not want to
> type `machine' instead of `C' there?

My statement is correct as it stands.  The optimal C code to add two vectors
is different on different machines, strange as it may seem.  The different
codes do exactly the same thing, IF all one wants to do is to add the vectors.
If the index has to be used, or in some other cases, one code may be better
than another because of other considerations.

> > But the expressions themselves will have considerable
> > portability, although if the expression is unknown in the target dialect, a
> > dictionary will have to be provided.
> 
> Was hat man im Zusammenhang mit Compilertechnologie unter einem `Dictionary'
> zu verstehen? Unter `Portabilitaet von Ausdruecken'?
> 
> [sorry, i could not resist :-)]

Ich verstehe Deutsch, aber nicht sehr gut.

A human has little difficulty in translating between two computer languages,
and not too much problem between "natural" languages.  Computer programs seem
to have much more of a problem.  I have relatively little difficulty in
translating between mathematical constructions and HLL or machine constructions,
but the current communication channels lack the flexibility for even fairly
efficient compilers to take over.  The compiler writer has provided the 
translation between x = y-z and machine code in such a way that the compiler
can take into account types, locations, etc., in producing good code.  The
same type of translation should be available for other constructs.

If a programmer must make the detailed translation, or even does this for
reasons of efficiency, in each case, the language designers and architects
will not see the uses of the constructs.  If the programmer instead can use
the dictionary, the construct is apparent, rather than its expansion, which
is likely to mean little.  How many would recognize the code for B'A^(-1)C,
A positive definite, without expecting that to be done?
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

rcg@lpi.liant.com (Rick Gorton) (03/16/91)

The example was:

foop (int x, int y) {
        struct div_and_rem t;
        t = divide(x,y);
        printf ("%d %d\n", t.quotient, t.remainder);
	}

The 80X86 assembly code (from gcc 1.3) was:
> foop:
> 	pushl %ebp		PPPP
> 	movl %esp,%ebp		PPPP
> 	pushl %ebx		PPPP
> 	movl 8(%ebp),%eax
> 	cltd
> 	idivl 12(%ebp)
> 	movl %eax,%ecx		XXXX
> 	movl %edx,%ebx		XXXX
> 	pushl %ebx		XXXX
> 	pushl %ecx		XXXX
> 	pushl $.LC0		XXXX
> 	call printf		XXXX
> 	leal -4(%ebp),%esp	EEEE
> 	popl %ebx		EEEE
> 	leave			EEEE
> 	ret			EEEE
>
In article <7850@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu
 (Herman Rubin) writes:
>This example has far too many loads and stores.  Possibly this MIGHT not be
>too important for a division, but how about something like frexp?  The 
>operations may be register-register, in which case all these loads and
>stores are inappropriate.  Also, something this simple should be inlined;
>if a subroutine call, there is the additional save/restore overhead which
>has to be done somewhere.

Sir: you have TOTALLY missed the boat here.  The example was written as
a subroutine/function.  It was NOT written as a piece of inline code.
See the lines marked "XXXX"? Those are for the user specified, machine
translatable, machine primitive called "printf".  It is a subroutine
call to print the quotient and remainders.  The lines marked "PPPP" are
the subroutine PROLOGUE instructions, which are required because of the
nature of the 80X86 architecture and calling conventions, and because this
routine is NOT a leaf routine (it calls another routine).  The
instructions marked "EEEE" are the subroutine EPILOGUE instructions.

So after getting to the heart of the matter, it takes 3 instructions
to perform the "divide" idiom when the two inputs are ON THE STACK:
> 	movl 8(%ebp),%eax
> 	cltd
> 	idivl 12(%ebp)

You have been exceptionally vocal about the poor code quality of
compilers, and yet you come up with a complaint like this?
Your commentary leads me to believe that you are annoyed that
your code executes slowly, but that you haven't made the slightest
effort to understand the reasons why.  Kind of like someone
putting water in the gas tank of their car, and then telling
the manufacturer of the engine that they need to improve the
efficiency of the engine, because it works so poorly.
And yet you want me, the compiler writer to try to fix your
problem?  Think again.

What programming language do you use?  What kind of hardware
do you execute your programs on?  Do you actually WRITE code at all?
Or do you just complain about how slowly it executes?

>Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
>Phone: (317)494-6054
>hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)

-- 
Richard Gorton               rcg@lpi.liant.com  (508) 626-0006
Language Processors, Inc.    Framingham, MA 01760
Hey!  This is MY opinion.  Opinions have little to do with corporate policy.

hrubin@pop.stat.purdue.edu (Herman Rubin) (03/16/91)

In article <1991Mar14.195853.27398@kithrup.COM>, sef@kithrup.COM (Sean Eric Fagan) writes:
> In article <7850@mentor.cc.purdue.edu> hrubin@pop.stat.purdue.edu (Herman Rubin) writes:

			......................

> >This example has far too many loads and stores.  
> 
> 9 memory references.  4 necessary for the calling sequence gcc conforms to.
> Three necessary to call another function, since the 'bcs' does not specify
> calling routines with values in registers.   Two more because arguments
> passed in are not in registers.  Leaving a total of 0 unnecessary loads and
> stores.  Could this be improved?  Certainly.  But not by much.

> >Possibly this MIGHT not be
> >too important for a division, but how about something like frexp?  
> 
> I think it was frexp() that I wrote for berkeley using gcc with inline
> assembly.  Uhm... I think it had 7 loads and stores, all but two or three of
> which would disappear if the function got inlined and optimized.
> 
> >The 
> >operations may be register-register, in which case all these loads and
> >stores are inappropriate.  
> 
> Herman:  where are you supposed to get the values from?  Magic?
> 
Computing q,r = a/b should not even consider a subroutine call.  The arguments,
or at least most of them, are likely to be the results of previous operations,
and hence already in registers.  The results are likely to be used in proximal
instructions, and hence kept in registers rather than being stored.  This IS
what decent compilers do for the "standard" operations of + - * / ^ | &.  Other
operations should be treated in the same way, and not as subroutine calls.

> >Also, something this simple should be inlined;
> >if a subroutine call, there is the additional save/restore overhead which
> >has to be done somewhere.
> 
> Jesus.  Guess what, herman:  the routine *was* inlined.  Take a look at the
> original source code again.

It is inlined, but it is still in the nature of a subroutine call.  These
"unusual" constructs should be treated as having general arguments, usually
not in specified locations.  The example given loaded the arguments and
stored the results, even if that were inlined.  Somewhat better would be
to have the arguments in registers specific to the inlining procedure, and
the results in other specified registers.  This is not what a decent compiler
now does for the operations it understands.  The expansion should allow adding
to THAT set of operations.

To summarize, what should be provided is to allow the compiler to accept the
idiom producer's insight into the various ways the job can be done using the
machine instructions or previous idioms, and optimize using this information.
As I understand an inlined subroutine call, it could not merely issue the
instruction 
		idivl a,b,q,r

or for some machines something similar to

		idivl a,b,q
		movl  q',r

where q' is the register adjacent to q, assuming that things were in the 
appropriate registers, and only load/store as needed.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907-1399
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!hrubin(UUCP)