[comp.arch] negative addresses

wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) (05/09/88)

Has anyone ever seen a machine with "negative addresses", that is, one
where the address space is -2**31..2**31-1 rather than 0..2**32-1??
Any thoughts on what the problems with such a scheme might be (or are)?

Why ask such a question, you ask -- well, I'm trying to remove unsigned
arithmetic from WM, and as far as I can tell, the primary (only?) use
of unsigned arithmetic is for address computations. Soooooo...

Bill Wulf

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/10/88)

In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
| Has anyone ever seen a machine with "negative addresses", that is, one
| where the address space is -2**31..2**31-1 rather than 0..2**32-1??
| Any thoughts on what the problems with such a scheme might be (or are)?

  "bits is bits," but I suspect that a lot of programs will have trouble
with non-contiguous addressing. Address wrap forces the lowest address
to "follow" the highest address, which may make I/O interesting.
| 
| Why ask such a question, you ask -- well, I'm trying to remove unsigned
| arithmetic from WM, and as far as I can tell, the primary (only?) use
| of unsigned arithmetic is for address computations. Soooooo...

  My gut feeling is that this is not correct, but I have no metrics at
this time to confirm or deny what you say.
| 
| Bill Wulf

-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

oconnor@sungoddess.steinmetz (Dennis M. O'Connor) (05/10/88)

An article by davidsen@crdos1.UUCP (bill davidsen) says:
] In article <...> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
] | Has anyone ever seen a machine with "negative addresses", that is, one
] | where the address space is -2**31..2**31-1 rather than 0..2**32-1??
] | Any thoughts on what the problems with such a scheme might be (or are)?
] 
]   "bits is bits," but I suspect that a lot of programs will have trouble
] with non-contiguous addressing. Address wrap forces the lowest address
] to "follow" the highest address, which may make I/O interesting.

The lowest address ( -2**N ) does NOT follow the highest ( 2**N-1 )
in the RPM40 memory scheme. An attempt to reach the negative area
by adding two positive values generates an overflow. This is good.
An attempt to cross zero is perfectly legal. Static storage is
centered around zero, accessible using small addresses.
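
In C terms, the check might look like this (an illustrative sketch only,
assuming 32-bit two's-complement longs; the real RPM40 of course does
this in hardware):

    /* add two addresses, flagging the overflow that would wrap
       two positive operands into the negative region */
    long add_addr(long base, long off, int *ovfl)
    {
        /* do the add unsigned so the wrap itself is well-defined */
        long sum = (long)((unsigned long)base + (unsigned long)off);
        /* two nonnegative operands overflowed iff the sum is negative */
        *ovfl = (base >= 0 && off >= 0 && sum < 0);
        return sum;
    }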

Existing programs may not make good use of the negative areas.
But you should be able to solve this at link time, or maybe even
at load time. At compile time, just put the heap in negative
memory and the stack in positive. Only the compiler/linker/loader
will ever know.

] | 
] | Why ask such a question, you ask -- well, I'm trying to remove unsigned
] | arithmetic from WM, and as far as I can tell, the primary (only?) use
] | of unsigned arithmetic is for address computations. Soooooo...

]   My gut feeling is that this is not correct, but I have no metrics at
] this time to confirm or deny what you say.

Unsigned arithmetic takes very little additional hardware. I've many
times been annoyed at not being able to have integer types that range
from 0..2**N-1, where N is the word size. RISC philosophy would
be to leave it in if it does not slow the machine or use a lot of
space on the chip. So you should leave it in.

] | Bill Wulf
] 	bill davidsen		(wedu@ge-crd.arpa)
--
 Dennis O'Connor   oconnor%sungod@steinmetz.UUCP  ARPA: OCONNORDM@ge-crd.arpa
    "The purpose of socialization is to teach wolves that they are sheep."

mcdonald@uxe.cso.uiuc.edu (05/10/88)

>Why ask such a question, you ask -- well, I'm trying to remove unsigned
>arithmetic from WM, and as far as I can tell, the primary (only?) use
>of unsigned arithmetic is for address computations. Soooooo...

>Bill Wulf

You must be kidding! The primary use of unsigned arithmetic is counting
numbers, of course, but there are jillions of other uses. Don't you
want to have, with 16-bit words, 32768 ABOVE 32767, and the corresponding
results with other word sizes? 

rrr@naucse.UUCP (Bob Rose ) (05/10/88)

In article <2393@uvacs.CS.VIRGINIA.EDU>, wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
> Has anyone ever seen a machine with "negative addresses", that is, one
> where the address space is -2**31..2**31-1 rather than 0..2**32-1??
> Any thoughts on what the problems with such a scheme might be (or are)?

Is there really any difference?
 
> Why ask such a question, you ask -- well, I'm trying to remove unsigned
> arithmetic from WM, and as far as I can tell, the primary (only?) use
> of unsigned arithmetic is for address computations. Soooooo...

You must be _REAL_ desperate for silicon!
Signed multiply, add and subtract are the same as the unsigned counterparts
except for maybe a condition code bit. Also, signed divide normally has
unsigned divide somewhere deep inside of it, and it's the signed divide
that normally gets left out of the instruction set (the VAX of course left
out the unsigned divide, but what the hey; the MC88000 has signed divide,
but it just does an unsigned divide or traps on negative numbers). Also,
the primary use of unsigned numbers isn't just for address computations.
Some of us use all the bits we can get (flags, fixed-point ...)

Robert R. Rose
Northern Arizona University, Box 15600
Flagstaff, AZ 86011
                    .....!ihnp4!arizona!naucse!rrr

mahar@weitek.UUCP (Mike Mahar) (05/10/88)

In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
>Has anyone ever seen a machine with "negative addresses", that is, one
>where the address space is -2**31..2**31-1 rather than 0..2**32-1??
>Any thoughts on what the problems with such a scheme might be (or are)?
>
>Why ask such a question, you ask -- well, I'm trying to remove unsigned
>arithmetic from WM, and as far as I can tell, the primary (only?) use
>of unsigned arithmetic is for address computations. Soooooo...
>
>Bill Wulf


A pretty good argument can be made that the 68000 is a signed address
machine.  And the address displacements are signed.  There is even a short
absolute addressing mode, which uses an absolute 16-bit signed address.
Most compilers only use 32K of that address space because they want memory
to start at 0.  The addressing modes of the 68000 are more orthogonal if
you assume that address 0 is the middle of memory rather than the beginning.
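
For illustration, the same sign extension in C on a 32-bit
two's-complement machine with 16-bit shorts (a sketch, not 68000 code):

    #include <stdio.h>
    int main(void)
    {
        short abs16 = (short)0xF000; /* a 16-bit "short absolute" address */
        long  ea    = abs16;         /* sign-extends, as the 68000 does   */
        /* prints ffffff000's 32-bit form, fffff000: the top of the
           address space, not location 0x0000F000 */
        printf("%lx\n", (unsigned long)ea);
        return 0;
    }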

-- 
	Mike Mahar
	UUCP: {turtlevax, cae780}!weitek!mahar

bcase@Apple.COM (Brian Case) (05/11/88)

In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
>Has anyone ever seen a machine with "negative addresses", that is, one
>where the address space is -2**31..2**31-1 rather than 0..2**32-1??
>Any thoughts on what the problems with such a scheme might be (or are)?

Check out the Elxsi machine.  It has "negative" addressing.  It seems that
with negative addressing, you get simple address checking for free:
put the OS kernel in negative space, for example.

>...as far as I can tell, the primary (only?) use
>of unsigned arithmetic is for address computations.

Yeah, same here, unless it's for some obscure purpose or is supported
directly by the source language.

henry@utzoo.uucp (Henry Spencer) (05/11/88)

> ... I'm trying to remove unsigned
> arithmetic from WM, and as far as I can tell, the primary (only?) use
> of unsigned arithmetic is for address computations. Soooooo...

I've thought for a long time that unsigned arithmetic is basically a relic
of the 16-bit days, when the difference between 15 and 16 really mattered,
and that the difference between 31 and 32 is much less significant.  I don't
see any compelling need for unsigned arithmetic on a 32-bit machine with
31-bit addresses or signed addresses... except that several of the newer
programming languages, notably C, absolutely require it.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

tim@amdcad.AMD.COM (Tim Olson) (05/11/88)

Note: I am also including comp.lang.c, since this also pertains to C...

In article <2393@uvacs.CS.VIRGINIA.EDU>, wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
| Has anyone ever seen a machine with "negative addresses", that is, one
| where the address space is -2**31..2**31-1 rather than 0..2**32-1??
| Any thoughts on what the problems with such a scheme might be (or are)?

I can't think of any real problems offhand, but this representation
affects a few things:

Your program's virtual address space probably starts at -2**31,
rather than 0 (to give you the full range).  This means that C-language
null pointers, because they are defined never to point to a valid
address, will probably have to be something other than the standard
all-zero bit representation.  This is not a real problem, as C allows
this.  However, it complicates the compiler somewhat (having to detect
the assignment/comparison of a pointer and the integer constant 0 as a
special case).  Also, buggy programs that used to run with references
through uninitialized static pointers might break in horrible ways (this
is not necessarily bad! ;-)
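
For example (illustrative C only; the actual null bit pattern is
whatever the machine uses):

    void f(void)
    {
        char *p = 0;   /* integer constant 0: the compiler emits the
                          machine's null pattern, not necessarily
                          all-zero bits */
        if (p == 0)    /* compared against that same pattern */
            return;
    }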

| Why ask such a question, you ask -- well, I'm trying to remove unsigned
| arithmetic from WM, and as far as I can tell, the primary (only?) use
| of unsigned arithmetic is for address computations. Soooooo...

What about support for explicit unsigned types in HLLs?  This would only
work if you limited the range of "unsigned" values to 0..2**31-1, rather
than the full 32-bit range.  However, my copy of the ANSI C Draft Spec
(which might be out of date) says:

	"For |signed char| and each type of |int|, there is a
	corresponding unsigned type (declared with the keyword
	|unsigned|) that utilizes the same amount of storage including
	sign information.  The set of nonnegative values of a signed
	type is a subset of its corresponding unsigned type."

	-- Tim Olson
	Advanced Micro Devices
	(tim@delirun.amd.com)

schooler@oak.bbn.com (Richard Schooler) (05/11/88)

I've seen too many other uses of unsigned arithmetic to contemplate
removing unsigned arithmetic.  Some quantities are inherently
unsigned, such as distance and degrees Kelvin.  To many industrial
applications, the difference between a 15-bit and a 16-bit integer is
a critical one.  Even 31 vs. 32 bits can make a difference,
particularly in fixed-point work, where resolution counts as well as
range.  Unsigned integers are also good for bit-diddling in a language
that doesn't support it explicitly.
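
For example, shifts of unsigned values are logical shifts, which is
usually what bit-diddling wants (a small sketch of my own):

    unsigned int field(unsigned int word)
    {
        /* extract the top byte of a 32-bit word; no copies of a
           sign bit are dragged in from the left */
        return (word >> 24) & 0xFFu;
    }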

	-- Richard Schooler
	schooler@bbn.com

tve@alice.UUCP (05/11/88)

In article <2393@uvacs.CS.VIRGINIA.EDU>, wulf@uvacs.UUCP writes:
> Has anyone ever seen a machine with "negative addresses", that is, one
> where the address space is -2**31..2**31-1 rather than 0..2**32-1??

Is it a hardware or a software issue?
If you use physical addressing and your memory starts at zero, it is a
board hardware issue.
If you use virtual memory, it's the way the software sets up the process
and the MMU.

I think having negative addresses is mainly a software issue, namely
having process code and static data start at -2**31 and the stack start
at 2**31-1 (going down).  (It's not exactly these numbers, but it's the
general idea.)
The main hardware obstacle I see is "short addressing".  You might be
interested in the fact that the 680x0 _sign_ extends short addresses,
which is exactly what you want for your negative address scheme (as
opposed to _zero_ extension).  In fact, I seem to remember that the
68000 manual says the address range for short addresses (16 bits)
is -2**15..2**15-1.

So take a 680x0 and try your kernel and compiler! If you have the
overflow/condition-code bits sorted out, your main problem will
be convincing users (and ported programs) to understand the
negative addresses!

Thorsten von Eicken		research!tve or tve@research.att.com
AT&T Bell Laboratories
Murray Hill, NJ

przemek@gondor.cs.psu.edu (Przemyslaw Klosowski) (05/11/88)

In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
>Has anyone ever seen a machine with "negative addresses", that is, one
>where the address space is -2**31..2**31-1 rather than 0..2**32-1??
>Any thoughts on what the problems with such a scheme might be (or are)?
>
For one, all those nice addressing modes with index scaled by the size of the
data structure:
	EA = (base) + (index)<<(ln2 size)
won't work unless the index is restricted.
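
One way to read the concern, as a C sketch (element size 4, so "ln2
size" is 2; assumes two's-complement shifts):

    long ea(long base, short idx)
    {
        /* sign-extend the index before scaling, and a negative
           index works fine ... */
        return base + ((long)idx << 2);
        /* ... but hardware that zero-extends the index field,
           i.e. base + ((unsigned short)idx << 2), lands in the
           wrong place whenever idx < 0 */
    }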



				przemek@psuvaxg.bitnet
				psuvax1!gondor!przemek

stuart@cs.rochester.edu (Stuart Friedberg) (05/11/88)

In article <389@attila.weitek.UUCP>, mahar@weitek.UUCP (Mike Mahar) writes:
> A pretty good arguement can be made that the 68000 is a signed address
> machine.  And the address displacements are signed.  There is even a short
> absolute addressing mode. It uses an absolute 16-bit signed address.

Right.  And a machine that makes use of that is the BBN Butterfly
multiprocessor.  Both the "positive" and "negative" portions of that
signed address space are used to efficiently access "Subspace Zero", where
magic memory mapped functions, implemented by a bit-slice coprocessor,
hang out.  However, the machine presents a conventional memory map (all
positive) to programmers, so Subspace Zero addresses have to be mapped
in at the very bottom and very top of the address space.

For this particular purpose, it would have been more symmetric, but far
less conventional, to regard memory space as signed.  It would have
been far more conventional, and convenient, FOR THIS PARTICULAR PURPOSE
if the 68000 provided a 16-bit absolute unsigned address, instead.

Stu Friedberg  {ames,cmcl2,rutgers}!rochester!stuart  stuart@cs.rochester.edu

jesup@pawl15.pawl.rpi.edu (Randell E. Jesup) (05/11/88)

>In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
>| Has anyone ever seen a machine with "negative addresses", that is, one
>| where the address space is -2**31..2**31-1 rather than 0..2**32-1??
>| Any thoughts on what the problems with such a scheme might be (or are)?

>| Bill Wulf

	The RPM40 has negative addressing (-2**N..2**N-1, where 2**(N+1) is
the process instruction or data space size).

     //	Randell Jesup			      Lunge Software Development
    //	Dedicated Amiga Programmer            13 Frear Ave, Troy, NY 12180
 \\//	beowulf!lunge!jesup@steinmetz.UUCP    (518) 272-2942
  \/    (uunet!steinmetz!beowulf!lunge!jesup) BIX: rjesup
(-: The Few, The Proud, The Architects of the RPM40 40MIPS CMOS Micro :-)

rk@lexicon.UUCP (Bob Kukura) (05/11/88)

From article <9485@apple.Apple.Com> by bcase@Apple.COM (Brian Case):
>In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
>>...as far as I can tell, the primary (only?) use
>>of unsigned arithmetic is for address computations.
>
>Yeah, same here, unless it's for some obscure purpose or is supported
>directly by the source language.

Unsigned arithmetic is used in array index calculations all the time.
You only have to check one bound if the first element is at zero and
the index is unsigned.  I'm sure languages like Pascal can use the
condition codes to check for negative indices, but condition codes are
not available to the programmer in C.
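
The trick itself, for concreteness (my sketch):

    /* one unsigned comparison checks both i >= 0 and i < n,
       because a negative i converts to a huge unsigned value */
    int in_range(int i, int n)
    {
        return (unsigned)i < (unsigned)n;
    }
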
-- 
-Bob Kukura		uucp: {husc6,linus,harvard,bbn}!spdcc!lexicon!rk
			phone: (617) 891-6790

nather@ut-sally.UUCP (Ed Nather) (05/12/88)

In article <9485@apple.Apple.Com>, bcase@Apple.COM (Brian Case) writes:
> In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
> 
> >...as far as I can tell, the primary (only?) use
> >of unsigned arithmetic is for address computations.
> 
> Yeah, same here, unless it's for some obscure purpose or is supported
> directly by the source language.

Your CS background is showing, gentlemen.  When I gather data from stars, I
count the precious photons one at a time, and use unsigned arithmetic to
massage them, since there are no negative photons (unlike negative addresses).
I wouldn't classify this as an obscure purpose, but someone else might.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU

radford@calgary.UUCP (Radford Neal) (05/12/88)

> Has anyone ever seen a machine with "negative addresses", that is, one
> where the address space is -2**31..2**31-1 rather than 0..2**32-1??
> Any thoughts on what the problems with such a scheme might be (or are)?
> 
> Why ask such a question, you ask -- well, I'm trying to remove unsigned
> arithmetic from WM, and as far as I can tell, the primary (only?) use
> of unsigned arithmetic is for address computations. Soooooo...
> 
> Bill Wulf


The 68000 has negative addresses when you're using it as a machine with
a 16-bit address space.

What do I mean by this? Well, the instructions all sign extend when
moving a 16-bit address into a 32-bit address register. E.g. 

   move.w a0@,a0

fetches a word from the address in a0, _sign extends it to 32-bits_
and replaces a0 with the result. So if you try to treat the 68000 as
a 16-bit address machine (for a speed gain), you must consider the
addresses to be signed, or go to a lot of effort to undo these sign 
extensions at times.

I've no idea if anyone writes 68000 programs like this. I was going to
once, but in the end decided I needed more than 2^16 bytes of memory.

I see only one problem with negative addresses. A C implementation will be 
much assisted by an addressing scheme in which (char*)0 has all zeros as
its bit pattern. In a machine with negative addresses, arranging this
might be annoying. Then again, it might be no problem, if you decide on
a layout like, say:

   -large             one            +large

   program code       data ->      <- stack

This seems like a reasonable layout unless you're desperate for address
space and don't want the potential unused gap between the end of
program code and the start of data at address 1. (You want a one-byte
gap at address zero, of course, so that null is invalid.)

    Radford Neal

Paul_L_Schauble@cup.portal.com (05/12/88)

I routinely work on a machine that almost does this, the Honeywell-Bull
GCOS mainframes. Virtual addresses are constructed from three pieces

   Descriptor       34 bits of address
   Address register 34 bit signed offset
   Index register   34 bit signed offset
   Instruction      16 bit signed offset

Oops, I meant 4 pieces. This all works fine except for one idiocy committed
by the hardware designers: the machine word is 36 bits, but an
'effective address to register' instruction that produces a negative
address does NOT leave a negative value in the register.

This machine does not consider the 34 bit final addresses to be signed. If I
were designing the machine again, I'd make all of the internal address
calculations be 36 bits, ending with a 36 bit signed value. Then, just throw
away the negative half of the address space by making those addresses fault.
If you don't think losing the address space is reasonable (and on a machine
with a small 32 bit address space you might well not), then HB could just
make the virtual addresses run -X to +X and adjust the base addresses
in the descriptors. Very very few slave processes would ever notice.

    Paul

henry@utzoo.uucp (Henry Spencer) (05/12/88)

> Your program's virtual address space probably starts at -2**31,
> rather than 0 (to give you the full range).  This means that C-language
> null pointers, because they are defined never to point to a valid
> address, will probably have to be something other than the standard
> all-zero bit representation.  This is not a real problem...

Unfortunately, it is a real problem, because there are zillions of
programs that implicitly assume that pointers are all-zeros.  It is
true that the language does not require it, but doing anything else
is an enormous pain in practice, according to the people who have
experimented with the idea.

Fortunately the problem can be bypassed, because there is absolutely no
reason why the null pointer has to point to the beginning of your address
space.  It is sufficient for the machine and the memory allocator to
conspire to ensure that no user data is ever allocated at location 0.
This would qualify as a nuisance, but hardly a disaster.

> ... it complicates the compiler somewhat (having to detect
> the assignment/comparison of a pointer and the integer constant 0 as a
> special case)...

Not by much.  Some machines already need such complications because their
pointers and integers are not the same size.

> ... uninitialized static pointers might break in horrible ways (this
> is not necessarily bad! ;-)

Are you going to fix all the programs that rely on it? ;-)  More to the
point, this is not an issue, because uninitialized static variables are
*not* initialized to all-zeros, they are initialized to the zero value
of their data type, which means the null pointer for pointers.  Now this
would be a bit of a pain for compilers on machines with odd representations
of the null pointer.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

ok@quintus.UUCP (Richard A. O'Keefe) (05/13/88)

In article <11571@ut-sally.UUCP>, nather@ut-sally.UUCP (Ed Nather) writes:
> In article <9485@apple.Apple.Com>, bcase@Apple.COM (Brian Case) writes:
> > In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
> > >...as far as I can tell, the primary (only?) use
> > >of unsigned arithmetic is for address computations.
> > Yeah, same here, unless it's for some obscure purpose or is supported
> > directly by the source language.
> 
> Your CS background is showing, gentlemen.  When I gather data from stars, I
> count the precious photons one at a time, and use unsigned arithmetic to
> massage them, since there are no negative photons (unlike negative addresses).

Ah, backgrounds.  "unsigned" arithmetic is NOT the same as "arithmetic on
natural numbers" (N as opposed to Z).  What it means is "modulo 2^N,
_sort of_".  I don't imagine that Ed Nather likes it when his counts
silently wrap around from 65535 to 0, but that's "unsigned" arithmetic
for you.  While I agree that it is useful to distinguish counts from
other types, I'm not convinced that this is an argument for having a
machine support "unsigned" arithmetic.  If you have two counters, what
could be more natural than taking the difference?  If I do
	unsigned int counter[2];
then
	counter[1] - counter[0]
in C is an "unsigned" quantity, which ought to come as an extremely
unpleasant shock to anyone who thought "unsigned" was good for counting.
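
Concretely (assuming 32-bit unsigned int):

    #include <stdio.h>
    int main(void)
    {
        unsigned int counter[2] = { 10, 5 };
        /* 5 - 10 wraps modulo 2**32: prints 4294967291, not -5 */
        printf("%u\n", counter[1] - counter[0]);
        return 0;
    }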

(1) Ada requires that the basic integer types are symmetric around 0
    (with perhaps an additional negative value), so on a VAX you can
    declare a subtype 0..16#7fff_ffff# but that's as far as it goes.
    Perhaps Bill Wulf could tell us whether the WM is meant for Ada?
(2) I am getting sick of computers which cannot do integer arithmetic
    and won't admit their mistakes.  Floating-point was bad enough,
    but when a computer will add 1 to a positive number and give me
    a negative number it's time we cleaned up our act.  I use C every
    day, but it is an antique, and enforces too many mistakes from
    the past (such as running with integer exceptions disabled).
(3) Why should the machine's wordsize show through into a programming
    language?  It's fair enough for there to be a threshold (the
    register width) below which arithmetic is fast, but if I am
    willing to declare the ranges of my variables, why shouldn't the
    compiler handle whatever size I specify?

So my vote is 
	unsigned arithmetic, no;
	support for (compiler-generated) multi-precision operations, yes.

nather@ut-sally.UUCP (Ed Nather) (05/14/88)

In article <965@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> 
> Ah, backgrounds.  "unsigned" arithmetic is NOT the same as "arithmetic on
> natural numbers" (N as opposed to Z).  What it means is "modulo 2^N,
> _sort of_".  

Well, er ... uh ... (blush) ...

> I don't imagine that Ed Nather likes it when his counts
> silently wrap around from 65535 to 0, but that's "unsigned" arithmetic
> for you. 

In practice my interface board watches for this overflow and yanks on a
polled interrupt line so the program can add 1 to an internal 16-bit
extension of the count.  But you make a strong point: that really
isn't the way I'd like things to behave.  I guess it's preferable to having
32768 counts represented as a negative number, but not by much.

> (2) I am getting sick of computers which cannot do integer arithmetic
>     and won't admit their mistakes.  Floating-point was bad enough,
>     but when a computer will add 1 to a positive number and give me
>     a negative number it's time we cleaned up our act.  

I agree, but I'm pretty sure floating point isn't much of an answer.  If
I try to count things by adding 1 to a floating point number, as the
count gets bigger a unit count becomes less and less significant, until
it falls below the precision of the mantissa and counting
stops completely.

Now, if we designed computers so integer word sizes were large enough to
hold the largest number we now use in floating point (ca. 2^512 or so)
then we wouldn't need a complex floating point system -- just good,
fast (wide) integer operations.  And I could count events without
constantly looking over my shoulder for problems.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU

aglew@urbsdc.Urbana.Gould.COM (05/14/88)

>> >[Wulf]: 
>> >...as far as I can tell, the primary (only?) use
>> >of unsigned arithmetic is for address computations.
>> [Brian Case]:
>> Yeah, same here, unless it's for some obscure purpose or is supported
>> directly by the source language.
>[Ed Nather]:
>Your CS background is showing, gentlemen.  When I gather data from stars, I
>count the precious photons one at a time, and use unsigned arithmetic to
>massage them, since there are no negative photons (unlike negative addresses).
>I wouldn't classify this as an obscure purpose, but someone else might.

What do you care if you are counting in a signed integer, and just
use half the range?

gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/14/88)

In article <1988May12.162906.16901@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Unfortunately, it is a real problem, because there are zillions of
>programs that implicitly assume that pointers are all-zeros.

I don't think this is true.  How about an example?

>... uninitialized static variables are
>*not* initialized to all-zeros, they are initialized to the zero value
>of their data type, which means the null pointer for pointers.  Now this
>would be a bit of a pain for compilers on machines with odd representations
>of the null pointer.

Not that much of a problem, really.  The compiler knows about static
data at compile time, and if not explicitly initialized it can output
something like
	ptrname: .word 0xF0F0F0F0 ; null pointer pattern
in the data section of the code it generates.

ok@quintus.UUCP (Richard A. O'Keefe) (05/14/88)

In article <11592@ut-sally.UUCP>, nather@ut-sally.UUCP (Ed Nather) writes:
> In article <965@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> 
> > (2) I am getting sick of computers which cannot do integer arithmetic
> >     and won't admit their mistakes.  Floating-point was bad enough,
> >     but when a computer will add 1 to a positive number and give me
> >     a negative number it's time we cleaned up our act.  
> 
> I agree, but I'm pretty sure floating point isn't much of an answer.

I didn't suggest it was!  No way!  [Quick exercise:  supposing 32-bit
integers and 32-bit IEEE floats, find numbers X, Y, Z such that
	X.GE.Y .AND. Y.GE.Z .AND. X.LT.Z
in a conforming Fortran-77 -- declarations may be needed.]

> Now, if we designed computers so integer word sizes were large enough to
> hold the largest number we now use in floating point (ca. 2^512 or so)

My point is that "word size" is an implementation detail which is of
interest to people writing device drivers, operating systems, compilers,
&c, but that most of my programs couldn't care less.  If I declare a
512-bit integer and the machine has 32-bit registers, it's the compiler's
job to cope.  All of the machines I am familiar with except the B6700
have support for multi-precision integer arithmetic, and the B6700 could
deal with 78-bit + sign integers anyway.  What's the good of ADDC.L and
the rest if the compiler won't generate them?
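
Even without ADDC, the carry is recoverable in plain C; here is a
sketch of one limb of such a compiler-generated multi-precision add
(numbers held as (high, low) pairs of unsigned longs):

    void add2(unsigned long ah, unsigned long al,
              unsigned long bh, unsigned long bl,
              unsigned long *rh, unsigned long *rl)
    {
        *rl = al + bl;
        /* an unsigned add wrapped iff the sum is less than an operand */
        *rh = ah + bh + (*rl < al);
    }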

I appreciate that there are applications where the "C" model is appropriate.
But why should COBOL be the only language I can do 18-digit integer
arithmetic in?  (LISP has arbitrary precision integers and rationals, but
that requires dynamic storage management, which I also like, but one dream
at a time.)

vandys@hpindda.HP.COM (Andy Valencia) (05/15/88)

	Logitech got caught by the "NIL doesn't have to be 0" syndrome.
I think they used 0xFFFF,0xFFFF.  Turns out that the 80286
architecture traps loads of invalid segment numbers into the
segment registers, but allows 0 to be loaded, and then traps the
reference instead.  So unless you're representing at least the segment
number as 0, you're not going to survive protected mode '286.
In their new compiler I believe NIL is now 0,0xFFFF.

				Andy Valencia
				vandys%hpindda.UUCP@hplabs.hp.com

nather@ut-sally.UUCP (Ed Nather) (05/16/88)

In article <28200145@urbsdc>, aglew@urbsdc.Urbana.Gould.COM writes:
> 
> >[Ed Nather]:
> >When I gather data from stars, I
> >count the precious photons one at a time, and use unsigned arithmetic to
> >massage them, since there are no negative photons (unlike negative addresses).
> What do you care if you are counting in a signed integer, and just
> use half the range?

Unfortunately I have found no simple way to get the star to cooperate with
respect to counting rates.  Sometimes 16 bits are enough, sometimes 32
aren't, depending on the star's brightness and the rapidity with which
it varies.

But these are details.  What we are all doing, in different disciplines, is
conforming to current computer architecture rather than cutting it to fit our
particular problem.  Compilers are just a way to insert a "virtual architecture"
in between the user and the hardware so it looks different -- friendlier to 
certain applications, usually.

We pay the cost at run-time.  If we can afford it, fine.  But we continue to
ask more and more of computers as they get faster and faster, and I doubt
this is likely to change any time soon.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU

henry@utzoo.uucp (Henry Spencer) (05/16/88)

> Your CS background is showing, gentlemen.  When I gather data from stars, I
> count the precious photons one at a time, and use unsigned arithmetic to
> massage them, since there are no negative photons...

Your lack of CS background is showing, Ed. :-)  Just because your numbers
are always positive doesn't mean you should use unsigned data types for
them.  As people have already pointed out, nasty surprises lurk in unsigned
arithmetic, and it is potentially less efficient to boot (although on most
machines the difference, if any, is slight).  The only compelling reason
to use unsigned data types for ordinary arithmetic purposes is if you really
need that one extra bit... and if that's the case, you're probably better
off using some sort of multi-precision arithmetic package anyway, because
sooner or later you'll need another bit.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

henry@utzoo.uucp (Henry Spencer) (05/16/88)

> >Unfortunately, it is a real problem, because there are zillions of
> >programs that implicitly assume that pointers are all-zeros.
> 
> I don't think this is true.  How about an example?

Any program written by a programmer who believes the 4.3BSD manuals, or
any of their ancestors, all of which claim that the arg-list terminator
for execl is 0 rather than (char *)0.  A pox on the Berkloids for not
having fixed this long ago!

I'm not intimately acquainted with the problem myself, but I do know that
at least one computer project that wanted to use a non-zero null pointer
studied the situation and decided to change the hardware instead.

>Not that much of a problem, really.  The compiler knows about static
>data at compile time, and if not explicitly initialized it can output
>something like
>	ptrname: .word 0xF0F0F0F0 ; null pointer pattern
>in the data section of the code it generates.

True, which is why I described it as "a bit of a pain" rather than as
a significant problem.  The biggest nuisance, actually, is the loss in
effectiveness of the "BSS" optimization for object-module size.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

billo@cmx.npac.syr.edu (Bill O) (05/16/88)

In article <11618@ut-sally.UUCP> nather@ut-sally.UUCP (Ed Nather) writes:
>In article <28200145@urbsdc>, aglew@urbsdc.Urbana.Gould.COM writes:
>> 
>> >[Ed Nather]:
>> >When I gather data from stars, I
>> >count the precious photons one at a time, and use unsigned arithmetic to
>> >massage them, since there are no negative photons (unlike negative addresses).
>> What do you care if you are counting in a signed integer, and just
>> use half the range?
>
>Unfortunately I have found no simple way to get the star to cooperate with
>respect to counting rates.  Sometimes 16 bits are enough, sometimes 32
>aren't, depending on the star's brightness and the rapidity with which
>it varies.
>

This is a perfect example of why we need higher-level languages
like lisp.  In lisp, you don't need to know the ranges of 
integers ahead of time. If your calculations overflow the
hardware representation of an integer, lisp just converts
the representation to Bignum, and works with it instead --
you may never even be aware that the conversion has occurred.

Yes, we need languages that are close to hardware (I don't think I'd
be able to write an operating system for, say, an IBM 370, in lisp.)
But much of programming could be made easier if we used (and designed)
languages more suited to the framing of algorithms rather than to the
writing of programs which run fast on particular pieces of hardware.

In fact, using such languages has the beneficial effect of encouraging
the design of hardware better suited to higher-level language
implementation -- example: lisp machines.

(NOTE: in lisp you actually get the best of both worlds. You can
fully specify types if you choose, to get faster-running code.)

karl@haddock.ISC.COM (Karl Heuer) (05/17/88)

In article <1988May15.222335.13174@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>The biggest nuisance, actually, [with a nonzero representation for NULL] is
>the loss in effectiveness of the "BSS" optimization for object-module size.

One could implement separate segments for "integral BSS", "pointer BSS", and
"floating BSS".  Mixed-type aggregate BSS would still be the compiler's
responsibility, unless you have a really smart object format.

You'd probably also catch a few programs that (improperly) assume that
"int x; char *y;" allocates adjacent memory cells.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
(In the above, "BSS" means "uninitialized static-duration data".)
--> Followup cautiously -- this article is still cross-posted <--

yuhara@ayumi.stars.flab.fujitsu.JUNET (== M. Yuhara ==) (05/17/88)

In article <2393@uvacs.CS.VIRGINIA.EDU>, wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
> Has anyone ever seen a machine with "negative addresses", that is, one
> where the address space is -2**31..2**31-1 rather than 0..2**32-1??

Yes, yes.
The TRON chip treats an address as a signed integer.
On TRONCHIP32, -2**31..-1 is called Shared-semi-Space (SS), which is shared
among processes. 0..2**31-1 is called Unshared-semi-Space (US), which is
independent among processes.
(You can think of SS as System's Space, and US as User's Space.)

TRON chip architecture is designed to be extensible from 32 bit address space
through 48 (TRONCHIP48) to 64 bit address space (TRONCHIP64).

If you think SS is 2**31..2**32-1, you will have difficulty when you
extend the address space. But if you think of it as signed, the address
space can be extended naturally.

		-2**63	+---------+
		   /	|         |
		 /	|         |
		/	|         |
	       /	|         |
-2GB+---------+ -2**31  |         |
    |  SS     |         |   SS    |
    |         |         |         |  <-- Some system parameters stay here.
  0 +=========+       0 +=========+      (such as reset vector.)
    |         |         |         |
    |  US     |         |         |
+2GB+---------+ 2**31-1 |   US    |
	       \    	|	  |
		\	|         |
		 \	|         |
		   \	|         |
		2**63-1	+---------+




-- 
Artificial Intelligence Division
Fujitsu Laboratories LTD., Kawasaki, Japan.
Masanobu YUHARA
kddlab!yuhara%flab.flab.fujitsu.junet@uunet.UU.NET

tainter@ihlpg.ATT.COM (Tainter) (05/17/88)

In article <965@cresswell.quintus.UUCP>, ok@quintus.UUCP (Richard A. O'Keefe) writes:
> I don't imagine that Ed Nather likes it when his counts
> silently wrap around from 65535 to 0, but that's "unsigned" arithmetic
> for you. 

> (2) I am getting sick of computers which cannot do integer arithmetic
>     and won't admit their mistakes.  Floating-point was bad enough,
>     but when a computer will add 1 to a positive number and give me
>     a negative number it's time we cleaned up our act.  

It can also give you overflow if your language allows for detecting it.
Don't blame the machine!  If you need to detect this and your language
doesn't allow it, then you need a different language.

Also, there is nothing stopping the implementers from making unsigned
arithmetic (with the loss of a bit of precision) out of signed numbers.
One simply detects the sign change and zeros out the value.  Ta da,
unsigned arithmetic the way it is defined now.

I wouldn't quibble about that extra bit either; no matter how many bits of
width you give your integers, there is someone who needs more.  Currently,
31 bits is probably just as sufficient as 32 bits.  47 will probably do just
as well as 48, 63 as 64, etc.

--j.a.tainter

alan@pdn.UUCP (Alan Lovejoy) (05/17/88)

In article <1988May15.220044.12987@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>...  As people have already pointed out, nasty surprises lurk in unsigned
>arithmetic, and it is potentially less efficient to boot (although on most
>machines the difference, if any, is slight). ...

Those are interesting assertions.  I, for one, would like to see the 
justification(s) for them.  Specifically, what are the "nasty surprises"
hiding in unsigned arithmetic that do not also exist for signed
arithmetic AS IT IS COMMONLY IMPLEMENTED IN HARDWARE?  Why should
signed arithmetic be more efficient than unsigned?

--just curious--


-- 
Alan Lovejoy; alan@pdn; 813-530-8241; Paradyne Corporation: Largo, Florida.
Disclaimer: Do not confuse my views with the official views of Paradyne
            Corporation (regardless of how confusing those views may be).
Motto: Never put off to run-time what you can do at compile-time!  

davet@oakhill.UUCP (David Trissel) (05/17/88)

In article <11592@ut-sally.UUCP> nather@ut-sally.UUCP (Ed Nather) writes:

>Now, if we designed computers so integer word sizes were large enough to
>hold the largest number we now use in floating point (ca. 2^512 or so)
>then we wouldn't need a complex floating point system -- just good,
>fast (wide) integer operations.  And I could count events without
>constantly looking over my shoulder for problems.

Having worked in the past as a systems programmer at both large commercial
as well as number-crunching installations I always found it interesting that 
COBOL was the only language I knew of which mandated support for an extended
integer data type of more than 32 bits.  (Out of curiosity I just HAD to
look at the source code for the 64 bit integer square root routines!  Yes
COBOL does support sqrt(), believe it or not.) 

That was about 10 years ago and the only thing I remember is that 
the COBOL spec required that this integer data type be able to exactly 
represent at least 19 (or was it 18?) decimal digits of precision.

It's obvious financial arithmetic requires precise results.  But I
never understood why the other languages favored by the number crunching folks
never supported a larger integer type.  Is there really no use for this?

 -- Dave Trissel  ut-sally!im4u!oakhill!davet

lisper-bjorn@CS.YALE.EDU (Bjorn Lisper) (05/17/88)

In article <493@cmx.npac.syr.edu> billo@cmx.npac.syr.edu (Bill O'Farrell)
writes:
(Ed Nather writes about the problems with counting photons from stars)
>>Unfortunatly I have found no simple way to get the star to cooperate with
>>respect to counting rates.  Sometimes 16 bits are enough, sometimes 32
>>aren't, depending on the the star's brightness and the rapidity with which
>>it varies.
>
>This is a perfect example of why we need higher-level languages
>like lisp.  In lisp, you don't need to know the ranges of 
>integers ahead of time. If your calculations overflow the
>hardware representation of an integer, lisp just converts
>the representation to Bignum, and works with it instead --
>you may never even be aware that the conversion has occurred.

A lisp implementation has other problems. Counting photons is a real-time
application and lisp is not very well equipped to deal with this. What if
the lisp interpreter decides to do a garbage collection just when a bunch of
photons are coming? Hiding representations is a nice idea but this is a
particular application where one must have close control of the
representation because of the real-time constraints.

>Yes, we need languages that are close to hardware (I don't think I'd
>be able to write an operating system for, say, an IBM 370, in lisp.)
>But much of programming could be made easier if we used (and designed)
>languages more suited to the framing of algorithms rather than to the
>writing of programs which run fast on particular pieces of hardware.
>
>In fact, using such languages has the beneficial effect of encouraging
>the design of hardware better suited to higher-level language
>implementation -- example: lisp machines.

While we are on the subject, has anyone heard anything lately about the
Japanese fifth-generation computer project and its ideas of extensive
hardware support for Prolog, used as a machine language? There was a
lot of talk about this back in '83, but since then I haven't heard much.
This is another idea I've never believed in.

Bjorn Lisper

mangler@cit-vax.Caltech.Edu (Don Speck) (05/17/88)

One should use unsigned numbers when mixing arithmetic and logical
operators, e.g. division by shifting right, modulus by masking, etc.

There are certain optimizations along those lines that are only safe
if the compiler can be sure that the number cannot be negative
(stemming largely from the convention of rounding towards zero).

Array indices can be range-checked with a single unsigned compare.
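
For example (a sketch; the signed variants differ because the
convention is to round division toward zero):

    void divmod8(unsigned int n, unsigned int *q, unsigned int *r)
    {
        *q = n >> 3;   /* n / 8: exact because n cannot be negative */
        *r = n & 7;    /* n % 8: likewise */
    }
    /* with signed i, (-1)/8 is 0 where division rounds toward zero,
       but (-1)>>3 is -1 with an arithmetic shift -- hence "only
       safe if the number cannot be negative" */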

Don Speck   speck@vlsi.caltech.edu  {amdahl,ames!elroy}!cit-vax!speck

andrew@frip.gwd.tek.com (Andrew Klossner) (05/18/88)

Doug Gwyn (gwyn@brl-smoke.ARPA) writes:

>> Unfortunately, it is a real problem, because there are zillions of
>> programs that implicitly assume that [null] pointers are all-zeros.

> I don't think this is true.  How about an example?

Sure Doug, from the system V kernel that you defend so ardently :-),
file io/tt1.c (vanilla release 3.1):

In routine ttout:

		if (tbuf->c_ptr)

appears twice.  (And in the same routine,

		if (tbuf->c_ptr == NULL)

appears twice.  Multiple hackers have clogged through here.)

In routine ttioctl:

		if (tp->t_rbuf.c_ptr) {
		if (tp->t_tbuf.c_ptr) {

The C standards I've seen so far are pretty clear in stating that the
conditional is compared against zero.  There doesn't seem to be leeway
to define pointer comparisons to be against some non-zero NULL value.

  -=- Andrew Klossner   (decvax!tektronix!tekecs!andrew)       [UUCP]
                        (andrew%tekecs.tek.com@relay.cs.net)   [ARPA]

bcase@Apple.COM (Brian Case) (05/18/88)

In article <1988May15.220044.12987@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>As people have already pointed out, nasty surprises lurk in unsigned
>arithmetic, and it is potentially less efficient to boot (although on most

Now, wait:  how does unsigned arithmetic make it less efficient to boot
the machine?  :-) :-) :-) :-)

JUST KIDDING!  IT'S A JOKE, SON.

nather@ut-sally.UUCP (Ed Nather) (05/18/88)

In article <6575@cit-vax.Caltech.Edu>, mangler@cit-vax.Caltech.Edu (Don Speck) writes:
> One should use unsigned numbers when mixing arithmetic and logical
> operators, e.g. division by shifting right, modulus by masking, etc.
> 

Indeed.  Most of the time-critical operations that must be very fast make
use of these, and other, "tricks."  When time is of the essence even an
integer multiply (rarely needed) can be too costly.  Shift-and-add
operations between two registers can multiply by 10, or 40, or such small
integers as special cases faster than most built-in multiply operations.
One obvious problem: the multiplier is not explicit, it is implicit in
the actual operations used.  Comments help.
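
For instance (my sketch of the kind of trick meant):

    unsigned int times10(unsigned int x)
    {
        /* x*10 = x*8 + x*2 = (x << 3) + (x << 1) */
        return (x << 3) + (x << 1);
    }

    unsigned int times40(unsigned int x)
    {
        /* x*40 = x*32 + x*8 = (x << 5) + (x << 3) */
        return (x << 5) + (x << 3);
    }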

> There are certain optimizations along those lines that are only safe
> if the compiler can be sure that the number cannot be negative
> (stemming largely from the convention of rounding towards zero).
> 
> Array indices can be range-checked with a single unsigned compare.
> 

Has anyone considered a method for telling the compiler that numbers can
only be positive, to make use of these operations?  I thought "unsigned"
did that, but maybe I'm wrong.  If I promise not to use negative integers
anywhere, even as array indices, can you generate faster code for me?

If I had that, and C had a way to keep track of the carry bit, I
wouldn't need assembly code at all.

Well, hardly ever.

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.AS.UTEXAS.EDU

mouse@mcgill-vision.UUCP (der Mouse) (05/18/88)

In article <10001@tekecs.TEK.COM>, andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> Doug Gwyn (gwyn@brl-smoke.ARPA) writes:
[an attribution appears to have been lost, but presumably it's our
friend Andrew Klossner]
>>> Unfortunately, it is a real problem, because there are zillions of
>>> programs that implicitly assume that [null] pointers are all-zeros.
>> I don't think this is true.  How about an example?
> Sure Doug, from the system V kernel that you defend so ardently :-),
> file io/tt1.c (vanilla release 3.1):
> 		if (tbuf->c_ptr)
> 		if (tbuf->c_ptr == NULL)
> 		if (tp->t_rbuf.c_ptr) {
> 		if (tp->t_tbuf.c_ptr) {

> The C standards I've seen so far are pretty clear in stating that the
> conditional is compared against zero.  There doesn't seem to be
> leeway to define pointer comparisons to be against some non-zero NULL
> value.

But when a pointer is compared against the integer constant zero,
either explicitly (second example) or implicitly (other three
examples), the zero is cast to the appropriate pointer type, producing
whatever bit pattern is appropriate for a null pointer of that type.
(Similar things happen when assigning the integer constant zero to a
pointer.  Note that "integer constant zero" is not the same thing as
"integer expression with value zero".)  This was true in K&R and
remains true in the dpANS.
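
The distinction, in a sketch:

    void f(void)
    {
        char *p, *q;
        int zero = 0;

        p = 0;             /* integer CONSTANT zero: becomes the null
                              pointer, whatever its bit pattern */
        q = (char *)zero;  /* integer EXPRESSION with value zero:
                              a bit-pattern conversion, NOT guaranteed
                              to yield a null pointer */
    }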

					der Mouse

			uucp: mouse@mcgill-vision.uucp
			arpa: mouse@larry.mcrcim.mcgill.edu

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (05/18/88)

In article <10001@tekecs.TEK.COM>, andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> >> Unfortunately, it is a real problem, because there are zillions of
> >> programs that implicitly assume that [null] pointers are all-zeros.
> 
> 		if (tbuf->c_ptr)

The trick here is that whenever a pointer is converted into an
integer (as here), the NULL pointer must be converted to the integer
0.  It doesn't matter what the bit pattern is before conversion.
Otherwise, as you say, the world would be swallowed up by huge
tidal waves and the sun would fall from the sky.  Are there any
implementations of C that use a non-0 bit pattern?  I pity the
compiler writer...

	Wayne

gwyn@brl-smoke.ARPA (Doug Gwyn ) (05/18/88)

In article <10001@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
-Doug Gwyn (gwyn@brl-smoke.ARPA) writes:
->> Unfortunately, it is a real problem, because there are zillions of
->> programs that implicitly assume that [null] pointers are all-zeros.
-> I don't think this is true.  How about an example?
-		if (tbuf->c_ptr)
-		if (tbuf->c_ptr == NULL)
-		if (tp->t_rbuf.c_ptr) {
-		if (tp->t_tbuf.c_ptr) {

None of these are non-portable uses of C; none of them depend on
null pointers being represented by all-zero data.

I'm still waiting for an example...

tim@amdcad.AMD.COM (Tim Olson) (05/18/88)

In article <10001@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
| Doug Gwyn (gwyn@brl-smoke.ARPA) writes:
| 
| >> Unfortunately, it is a real problem, because there are zillions of
| >> programs that implicitly assume that [null] pointers are all-zeros.
| 
| > I don't think this is true.  How about an example?
| 
| Sure Doug, from the system V kernel that you defend so ardently :-),
| file io/tt1.c (vanilla release 3.1):
| 
| In routine ttout:
| 
| 		if (tbuf->c_ptr)
| 
| appears twice.  (And in the same routine,
| 
| 		if (tbuf->c_ptr == NULL)
| 
| appears twice.  Multiple hackers have clogged through here.)
| 
| In routine ttioctl:
| 
| 		if (tp->t_rbuf.c_ptr) {
| 		if (tp->t_tbuf.c_ptr) {
| 
| The C standards I've seen so far are pretty clear in stating that the
| conditional is compared against zero.  There doesn't seem to be leeway
| to define pointer comparisons to be against some non-zero NULL value.

"NULL" wasn't being discussed, it was the internal representation of null
pointers (this seems to cause so much confusion -- how about calling the
latter a different name, like "nil"?).

As has been stated in comp.lang.c numerous times: in C, nil can be any
bit pattern, as long as it is guaranteed not to ever point to valid
data.  NULL must be 0 (or perhaps (void *)0 under ANSI).  The compiler
takes care of the appropriate conversions between NULL and nil.  The
above code is correct C.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)

sarima@gryphon.CTS.COM (Stan Friesen) (05/19/88)

In article <10001@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
>Doug Gwyn (gwyn@brl-smoke.ARPA) writes:
>
>> I don't think this is true.  How about an example?
[[Of code assuming the null pointer is all zero bits]]
>
>Sure Doug, from the system V kernel that you defend so ardently :-),
>file io/tt1.c (vanilla release 3.1):
>
>		if (tbuf->c_ptr)
>
>		if (tbuf->c_ptr == NULL)
>
>		if (tp->t_rbuf.c_ptr) {
>		if (tp->t_tbuf.c_ptr) {
>
>The C standards I've seen so far are pretty clear in stating that the
>conditional is compared against zero.  There doesn't seem to be leeway
>to define pointer comparisons to be against some non-zero NULL value.

	Yes, but they ALSO require that comparing a NULL-pointer to zero
evaluate to true *whatever* the representation of the NULL-pointer. The
compiler is *also* required to convert the integer constant 0 to the
NULL-pointer on assignment. So *none* of the above examples assume anything
about the representation of the NULL-pointer, they are all strictly
conforming. There *are* cases of code that does make such assumptions.
They all have the following general form:

func1(p)
char *p;
{
	/* stuff */
}

...

func2()
{
	...
	func1(0);
}

	In this example the code assumes both the representation *and* the
size of NULL-pointer. This code is *not* portable even among existing
compilers. Nor is it even conforming, let alone strictly so. Any code of
this form only works accidentally and needs to be fixed anyway.
-- 
Sarima Cardolandion			sarima@gryphon.CTS.COM
aka Stanley Friesen			rutgers!marque!gryphon!sarima
					Sherman Oaks, CA

henry@utzoo.uucp (Henry Spencer) (05/20/88)

>		if (tp->t_tbuf.c_ptr) {
>The C standards I've seen so far are pretty clear in stating that the
>conditional is compared against zero.  There doesn't seem to be leeway
>to define pointer comparisons to be against some non-zero NULL value.

Sigh.  Not this again!  If you *read* the fine print in the standards,
you will find that this construct is 100.00000% equivalent to saying
"if (tp->t_tbuf.c_ptr != NULL) {".  In pointer contexts, which this
is, the integer constant 0 stands for "whatever bit pattern is used for
null pointers".  When p is a pointer, "if (p)", "if (p != 0)" and
"if (p != NULL)" are completely synonymous, by the definition of C.

The problem that does come up is that compilers in general cannot tell
whether a parameter in a function call is meant to be a pointer or not,
and hence cannot supply the automatic conversion.  There is also trouble
with programs that explicitly convert pointers to integers and back.
-- 
NASA is to spaceflight as            |  Henry Spencer @ U of Toronto Zoology
the Post Office is to mail.          | {ihnp4,decvax,uunet!mnetor}!utzoo!henry

phil@osiris.UUCP (Philip Kos) (05/20/88)

In article <4086@gryphon.CTS.COM>, sarima@gryphon.CTS.COM (Stan Friesen) writes:
> In article <10001@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> > There doesn't seem to be leeway
> >to define pointer comparisons to be against some non-zero NULL value.
> 
> 	Yes, but they ALSO require that comparing a NULL-pointer to zero
> evaluate to true *whatever* the representation of the NULL-pointer....

Please be careful not to confuse null pointers with NULL pointers.  Null
pointers have a formal definition within the language, but NULL pointers
don't really; NULL is just a convention and not part of the language spec.
("We write NULL instead of zero, however, to indicate more clearly that
this is a special value for a pointer", K&R first edition, pp. 97-98; "The
symbolic constant NULL is often used in place of zero, as a mnemonic to
indicate more clearly that this is a special value for a pointer", K&R
second edition, p. 102.)

I've also seen NULL defined as (char *) 0, by the way...


                                                                 Phil Kos
                                                      Information Systems
...!uunet!pyrdc!osiris!phil                    The Johns Hopkins Hospital
                                                            Baltimore, MD

radford@calgary.UUCP (Radford Neal) (05/20/88)

In article <10001@tekecs.TEK.COM>, andrew@frip.gwd.tek.com (Andrew Klossner) writes:

> >> Unfortunately, it is a real problem, because there are zillions of
> >> programs that implicitly assume that [null] pointers are all-zeros.
> 
> > I don't think this is true.  How about an example?
> 
> Sure Doug, from the system V kernel... In routine ttout:
> 
> 		if (tbuf->c_ptr)

My understanding is that this is standards-conforming and portable.
I assume that c_ptr is declared to be of pointer type. The above 
statement is equivalent to

		if (tbuf->c_ptr!=0)

which is equivalent to

		if (tbuf->c_ptr!=(char*)0)

(or (int*)0 or whatever). The expression (char*)0 is _defined_ to be
the null pointer, whatever its bit pattern may be. Note that "NULL"
has nothing to do with anything, being merely a macro found in the
<stdio.h> include file.

An example of a non-portable program is the following:

    char *p;
    int i;
    ...
    i = (int)p;
    if (i!=0)
       *p = ...; /* no guarantee that p is not null... */

Only occurrences of 0 in the explicit or implicit context (...*) 0
are magically converted to null pointers.

    Radford Neal

barnett@vdsvax.steinmetz.ge.com (Bruce G. Barnett) (05/20/88)

In article <2393@uvacs.CS.VIRGINIA.EDU> wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
|Has anyone ever seen a machine with "negative addresses"?

I believe the Sun 80386 RoadRunner has negative addresses, once you get
beyond 2 gigabytes (assuming you use signed integers).

	Drat! There goes my nifty sort algorithm. 

--
 :-)
-- 
	Bruce G. Barnett 	<barnett@ge-crd.ARPA> <barnett@steinmetz.UUCP>
				uunet!steinmetz!barnett

anc@camcon.uucp (Adrian Cockcroft) (05/20/88)

In article <4000@ayumi.stars.flab.fujitsu.JUNET>, yuhara@ayumi.stars.flab.fujitsu.JUNET (== M. Yuhara ==) writes:
> In article <2393@uvacs.CS.VIRGINIA.EDU>, wulf@uvacs.CS.VIRGINIA.EDU (Bill Wulf) writes:
> > Has anyone ever seen a machine with "negative addresses", that is, one
> > where the address space is -2**31..2**31-1 rather than 0..2*32-1??
> 
> Yes, yes.
> TRON chip treats an address as a signed integer.

The Inmos Transputer family also has a signed address space.  The top of
the address space is at 7FFFFFFF, which is where it boots from ROM; the
bottom is at 80000000, which is where the on-chip RAM and the memory-mapped
link engines sit.  It has a special instruction, "mint", for doing a quick
load of the minimum integer (80000000).  Because addresses are signed, you
can use the normal integer comparison instructions, so the instruction set
is simplified.  The code generated is usually totally position independent
(the instruction set is designed that way), so absolute addresses are only
needed for talking to memory-mapped hardware.
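
The effect in C terms is something like this (an illustration only, not
actual Transputer code):

	/* With a signed address space, one ordinary signed comparison
	   orders all of memory, from 80000000 up through 7FFFFFFF. */
	int precedes(a, b)
	char *a, *b;
	{
		return (long)a < (long)b;
	}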

-- 
  |   Adrian Cockcroft anc@camcon.uucp  ..!seismo!mcvax!ukc!camcon!anc
-[T]- Cambridge Consultants Ltd, Science Park, Cambridge CB4 4DW,
  |   England, UK                                        (0223) 358855
      (You are in a maze of twisty little C004's, all alike...)

woerz@iaoobelix.UUCP (Dieter Woerz) (05/20/88)

In article <10001@tekecs.TEK.COM> andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> ...
>In routine ttout:
>
>		if (tbuf->c_ptr)
>
>appears twice.  (And in the same routine,
>
>		if (tbuf->c_ptr == NULL)
>
>appears twice.  Multiple hackers have clogged through here.)
>
>In routine ttioctl:
>
>		if (tp->t_rbuf.c_ptr) {
>		if (tp->t_tbuf.c_ptr) {
> ...

I have to admit that the others may not work, but I think you should
be able to tweak the compilers for that architecture to compare such
pointers against the zero pointer of that architecture, which is not
necessarily all-zero bits.

The second example should work if you simply redefine NULL to the
value of that zero pointer.

------------------------------------------------------------------------------

Dieter Woerz
Fraunhofer Institut fuer Arbeitswirtschaft und Organisation
Abt. 453
Holzgartenstrasse 17
D-7000 Stuttgart 1
W-Germany

BITNET: iaoobel.uucp!woerz@unido.bitnet
UUCP:   ...{uunet!unido, pyramid}!iaoobel!woerz

flaps@dgp.toronto.edu (Alan J Rosenthal) (05/22/88)

Henry Spencer wrote:
>>Unfortunately, it is a real problem, because there are zillions of
>>programs that implicitly assume that pointers are all-zeros.

Doug Gwyn replied:
>I don't think this is true.  How about an example?

Later, he wrote that he was still waiting for an example, so I'll provide one.

A large project on which I am currently working has many segments in
which lists of things are manipulated, largely for displaying in menus
but also for other standard data-processing tasks.  There is a
standardised doubly-linked list representation, and corresponding
routines.  The caller of these routines has as its representation of
the list a "head" which contains header-like information for the list.

When I first tried to use these routines, I looked through and found
out how to do various operations.  The operation I could not find was
how to initialise a doubly-linked list after having declared the head.
It turned out that a correct initialisation was to set the three
pointers in a struct dll_head all to NULL.  Since existing code usually
happened to declare the head as either global or file static (storage
which starts out as all-zero bits anyway), most people never bothered
to initialise the head at all.  When one was declared as auto, people
called zero((char *)&thing, sizeof(struct dll_head)), zero() being a
function which sets a region of memory to zero bits.
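
Roughly, the pattern was (member names invented here for illustration;
the real declarations differ):

	struct dll_head {
		struct dll_node *first;
		struct dll_node *last;
		struct dll_node *spare;
	};

	struct dll_head thing;	/* auto: garbage until initialised */

	zero((char *)&thing, sizeof(struct dll_head));
				/* "initialised", assuming null pointers
				   are all-zero bits */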

So there's your example.

[We have since added an initialisation function!]

ajr

--
- Any questions?
- Well, I thought I had some questions, but they turned out to be a trigraph.

henry@utzoo.uucp (Henry Spencer) (05/22/88)

> ... what are the "nasty surprises"
> hiding in unsigned arithmetic that do not also exist for signed
> arithmetic AS IT IS COMMONLY IMPLEMENTED IN HARDWARE?

Well, for example, consider that a+b>c does not imply a>c-b in unsigned
arithmetic.  (To make this more obvious, consider that b>c does not imply
c-b<0, since no unsigned number is less than zero.)  Remember too that one
unsigned number in a calculation tends to make the whole calculation be
done unsigned, by C rules, sometimes unexpectedly.
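
Concretely, with unsigned 16-bit arithmetic:

	unsigned a = 1, b = 5, c = 3;

	a + b > c	/* 6 > 3: true */
	a > c - b	/* c - b wraps around to 65534, so 1 > 65534: false */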

> Why should signed arithmetic be more efficient than unsigned?

Because the hardware sometimes supports it rather better.  On the machine
I'm typing this on, for example, unsigned multiplication or division is
significantly slower than the signed forms, because the hardware multiply
and divide instructions are signed-only.

henry@utzoo.uucp (Henry Spencer) (05/22/88)

If I were implementing a C compiler for a 32-bit machine, I would at least
consider the notion of making "long" 64 bits.  It would probably break a
depressing amount of code, but it would have its uses.  (NB this is also
a reason for having unsigned arithmetic in the machine, since it makes
multiprecision arithmetic easier, as I recall.)
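
The multiprecision connection, roughly: unsigned overflow is well-defined
wraparound, so detecting a carry costs a single compare.  A sketch
(assuming 32-bit unsigned longs):

	/* z = x + y, n words each, least-significant word first */
	void
	addmp(z, x, y, n)
	unsigned long *z, *x, *y;
	int n;
	{
		unsigned long t, carry = 0;
		int i;

		for (i = 0; i < n; i++) {
			t = x[i] + carry;
			carry = (t < carry);		/* wrapped? */
			z[i] = t + y[i];
			carry += (z[i] < t);		/* wrapped? */
		}
	}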

bill@proxftl.UUCP (T. William Wells) (05/23/88)

In article <10001@tekecs.TEK.COM>, andrew@frip.gwd.tek.com (Andrew Klossner) writes:
> Doug Gwyn (gwyn@brl-smoke.ARPA) writes:
>
> >> Unfortunately, it is a real problem, because there are zillions of
> >> programs that implicitly assume that [null] pointers are all-zeros.
>
> > I don't think this is true.  How about an example?
>
> Sure Doug, from the system V kernel that you defend so ardently :-),
> file io/tt1.c (vanilla release 3.1):
>
> In routine ttout:
>
>               if (tbuf->c_ptr)
>
> appears twice.  (And in the same routine,
>
>               if (tbuf->c_ptr == NULL)

ANSI says that the two are equivalent.  Actually, ANSI says
(about `if'): "...  the first substatement is executed if the
expression compares unequal to 0.  ...".  This means that you can
think of the statement `if (x)' as `if (x != 0)'.

Note that ANSI only insists that `pointer == 0' be true if and
only if pointer is a null pointer; it makes no requirements that
the pointer actually contain any zeros.

For example, on an 8086, you could define a null pointer as one
with a zero (or any other value) offset.  An implicit or explicit
compare of the pointer to zero would then check only the offset.

An interesting point: ANSI does not define (at least not anywhere
I can find it) the result of `x == y' when x and y are both null
pointers.  Actually, a literal reading of the standard implies
that this would compare false!  Here is my reasoning.  The
standard says that "If two pointers ...  compare equal, they
point to the same object." Since a null pointer does not point to
ANY object, comparing anything to a null pointer should return
false.

I hope that this is an oversight.

alan@pdn.UUCP (Alan Lovejoy) (05/23/88)

In article <1988May22.020336.17472@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
/> ... what are the "nasty surprises"
/> hiding in unsigned arithmetic that do not also exist for signed
/> arithmetic AS IT IS COMMONLY IMPLEMENTED IN HARDWARE?

>Well, for example, consider that a+b>c does not imply a>c-b in unsigned
>arithmetic.  (To make this more obvious, consider that b>c does not imply
>c-b<0, since no unsigned number is less than zero.)  Remember too that one
>unsigned number in a calculation tends to make the whole calculation be
>done unsigned, by C rules, sometimes unexpectedly.

But "a + b > c" does not imply that "a > c - b" using signed arithmetic either,
because undeflow is still possible!  For example: "2 + 1 > -32768" does 
not imply that "2 > -32768 - 1", because for 16 bit integers, "-32768 - 1"
is 32767.

/> Why should signed arithmetic be more efficient than unsigned?

>Because the hardware sometimes supports it rather better.  On the machine
>I'm typing this on, for example, unsigned multiplication or division is
>significantly slower than the signed forms, because the hardware multiply
>and divide instructions are signed-only.

This argument might be valid if we were discussing what sort of
arithmetic to use on your machine.  But the subject is what sort of
arithmetic to design into new machines.  This is comp.arch, not
comp.sys.yourmachine.

Unsigned arithmetic is just an efficient range checking mechanism.
Ranges with a lower bound of zero are quite common, and it makes sense
to support them in the hardware.
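
The classic instance is the bounds check (names here are placeholders):

	int i;
	...
	if ((unsigned)i < TABLE_SIZE)	/* one compare does the work of
					   i >= 0 && i < TABLE_SIZE */
		use(table[i]);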
-- 
Alan Lovejoy; alan@pdn; 813-530-8241; Paradyne Corporation: Largo, Florida.
Disclaimer: Do not confuse my views with the official views of Paradyne
            Corporation (regardless of how confusing those views may be).
Motto: Never put off to run-time what you can do at compile-time!  

paul@unisoft.UUCP (n) (05/23/88)

In article <206@proxftl.UUCP> bill@proxftl.UUCP (T. William Wells) writes:
>>
>>               if (tbuf->c_ptr == NULL)
>
>ANSI says that the two are equivalent.  Actually, ANSI says
>(about `if'): "...  the first substatement is executed if the
>expression compares unequal to 0.  ...".  This means that you can
>think of the statement `if (x)' as `if (x != 0)'.
>

Correct me if I'm wrong .....

			   if (x) ...

	really means 	   if (x != 0) ....
	which really means if ((x != 0) != 0) ...
	which really means if (((x != 0) != 0) != 0) ...
	which really means if ((((x != 0) != 0) != 0) != 0) ...

				etc etc


hence the need for all these new super optimising compilers .....


		Paul

	
-- 
Paul Campbell, UniSoft Corp. 6121 Hollis, Emeryville, Ca
	E-mail:		..!{ucbvax,hoptoad}!unisoft!paul  
Nothing here represents the opinions of UniSoft or its employees (except me)
"Nuclear war doesn't prove who's Right, just who's Left" (ABC news 10/13/87)

djones@megatest.UUCP (Dave Jones) (05/24/88)

in article <959@unisoft.UUCP>, paul@unisoft.UUCP (n) says:

} 			   if (x) ...
} 
} 	really means 	   if (x != 0) ....
} 	which really means if ((x != 0) != 0) ...
} 	which really means if (((x != 0) != 0) != 0) ...
} 	which really means if ((((x != 0) != 0) != 0) != 0) ...
} 
} 				etc etc
} 
} 
} hence the need for all these new super optimising compilers .....
} 


Love it.  You made my day.

Have you ever put two packages together, and cpp says "FALSE redefined"?

That means that not one, but TWO, count 'em, TWO .h files have a
macro like

#define FALSE (0!=0)

and they aren't the same.

In the code, you're likely to see 

	if( x == FALSE )

which translates to

	if( x == (0!=0) )

which, in C,  ain't the same as if( !x ).  I call this the
"Law of the Included Muddle".

Now if they had just written the macro that way, we'd have this:

#define FALSE ((0!=0)!=FALSE)

Now we've *really* got something for your optimizing compilers to crunch on.

  if(x != ((0!=0)!=((0!=0)!=((0!=0)!= ... 

I wonder if the TRUE-believers figure that someday they'll need to redefine
TRUE to -- oh let's see -- maybe 42?



		-- Dave J.

bill@proxftl.UUCP (T. William Wells) (05/28/88)

In article <4086@gryphon.CTS.COM>, sarima@gryphon.CTS.COM (Stan Friesen) writes:
)             There *are* cases of code that does make such assumptions.
) They all have the following general form:
)
) func1(p)
) char *p;
) {
)       /* stuff */
) }
)
) ...
)
) func2()
) {
)       ...
)       func1(0);
) }
)
)       In this example the code assumes both the representation *and* the
) size of NULL-pointer. This code is *not* portable even among existing
) compilers. Nor is it even conforming, let alone strictly so. Any code of
) this form only works accidentally and needs to be fixed anyway.
) --
) Sarima Cardolandion                   sarima@gryphon.CTS.COM
) aka Stanley Friesen                   rutgers!marque!gryphon!sarima

Actually, if you are using Standard C, declare func1 with a prototype
and the problem goes away. Prototypes were invented to (among other
things) solve this kind of problem.
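
Something like this, that is (a sketch of the Standard C fix):

	void func1(char *p);		/* prototype in scope */

	void func2(void)
	{
		func1(0);		/* 0 is converted to a (char *) null
					   pointer, right size and all */
	}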

Perhaps this is what you intended to say?

henry@utzoo.uucp (Henry Spencer) (06/01/88)

> ... underflow is still possible!

Underflow/overflow is possible in both signed and unsigned arithmetic,
but my experience is that people are much less likely to think about it
for unsigned arithmetic.  They're used to the idea of magnitude limits,
but not used to said limits not being roughly symmetrical around zero.

> This argument might be valid if we were discussing what sort of
> arithmetic to use on your machine.  But the subject is what sort of
> arithmetic to design into new machines...

The specific question I was answering was not that tightly phrased.
-- 
"For perfect safety... sit on a fence|  Henry Spencer @ U of Toronto Zoology
and watch the birds." --Wilbur Wright| {ihnp4,decvax,uunet!mnetor}!utzoo!henry

ericb@athertn.Atherton.COM (Eric Black) (06/02/88)

In article <8805220452.AA14606@explorer.dgp.toronto.edu> flaps@dgp.toronto.edu (Alan J Rosenthal) writes:
>
>Henry Spencer wrote:
>>>Unfortunately, it is a real problem, because there are zillions of
>>>programs that implicitly assume that pointers are all-zeros.
>
>Doug Gwyn replied:
>>I don't think this is true.  How about an example?
>
>Later, he wrote that he was still waiting for an example, so I'll provide one.
> [...description of linked list of nodes pointing to other nodes...]
>people forgot to bother to initialise the head.  When one was declared
>as auto, people called zero((char *)&thing,sizeof(struct dll_head)),
>zero() being a function which sets a region of memory to zero bits.
>
>So there's your example.

A wonderful example of non-portable code!  Essentially what you are doing
without making it explicit is punning the pointer, just as if you had
something like:
	union {
		long	ima_number;
		char	*ima_pointer;
	} pun;
and set the bits via one union member, and looked at them via the other.

There are also "zillions of programs" that assume the order of characters
in a long, and break when moved from a VAX to a 68K or subjected to some
other analogous move.

Such code should be punishable by forcing the programmer to port C programs
running under UNIX to run under PRIMOS.  (no :-)

>
>[We have since added an initialisation function!]
>
>ajr

Huzzah!  What happens now when people "forget to bother to initialise the
head"??  Buggy code is an existence proof for buggy code...  A non-portable
"safety net" for programmers of said buggy code doesn't seem to me to be
a whole lot different from device drivers that assume that all device
control and status registers look exactly like the CSR on Unibus devices;
both might be perfectly valid in the environment they assume, but are
quite wrong when taken out of that environment.

Note that such assumptions are not just machine-dependent; they can also
be compiler-dependent!

I hope there was a :-) truncated by NNTP in your article...

:-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-):-)


-- 
Eric Black	"Garbage in, Gospel out"
Atherton Technology, 1333 Bordeaux Dr., Sunnyvale, CA, 94089
   UUCP:	{sun,decwrl,hpda,pyramid}!athertn!ericb
   Domainist:	ericb@Atherton.COM

bob+@andrew.cmu.edu (Bob Sidebotham) (06/17/88)

> Excerpts from magazines.software.z: 18-May-88 Re: negative addresses,
> Tim Olson@amdcad.AMD.COM

> As has been stated in comp.lang.c numerous times: in C, nil can be any
> bit pattern, as long as it is guaranteed not to ever point to valid
> data.  NULL must be 0 (or perhaps (void *)0 under ANSI).  The compiler
> takes care of the appropriate conversions between NULL and nil.  The
> above code is correct C.

>       -- Tim Olson
>       Advanced Micro Devices

I haven't been following this discussion, but I'll add my two-bits worth
anyway:  my current practice, which is apparently not legal C, is to zero data
structures after allocating them, to guarantee the structure is in a reasonable
state.  This works well for most data types, and, I thought, for pointers.

For the moment, I will still consider this a reasonable practice, despite
the fact that it may not work on some obscure machines: on the machines I
work with, it provides me with a safe way to initialize a data structure
which is _immune to changes in the definition of the structure_.  If I
explicitly initialize all of the fields of a structure, someone will later
add a field without remembering to add the corresponding initializing code.

It would be preferable to have a C built-in that could be used to
initialize all of a structure's components to "zero" values, and even more
preferable if hardware manufacturers, operating system builders, compiler
writers, and, of course, language specification writers all recognized
that 0 really is a _very_ special value.
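
For what it's worth, if I read the draft Standard correctly, a partial
aggregate initializer already comes close to this: members you leave out
are initialized as if the object were static, so pointers become genuine
null pointers whatever their bit pattern ("struct anything" below is just
a stand-in):

	struct anything x = { 0 };	/* remaining members: 0, 0.0, or null */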

Bob Sidebotham
P.S. The formatting of this note for the net is beyond my control...